The goal of the script is to merge the two constituent datasets all_areas_nuseds (referred to as NUSEDS) and conservation_unit_census_sites (referred to as CUSS) so each time series in NUSEDS (i.e., fish counts associated to a unique populations, characterized by a unique IndexId/POP_ID & GFE_ID association) can be attributed to a unique conservation unit (CU) in CUSS (characterized by CU_NAME and FULL_CU_IN). But there are multiple inconsistencies between the two datasets that lead to discard a significant amount of data points in NUSEDS, unless fixes are implemented. This script is a attempt to implement such fixes.

The dataset version corresponding to this script is:

Fisheries and Oceans Canada. 2025. NuSEDS - New Salmon Escapement Database System. Pacific Biological Station. Updated in 2025-11-03 and downloaded in January 01, 2026, from https://open.canada.ca/data/en/dataset/c48669a3-045b-400d-b730-48aafe8c5ee6.

IMPORTANT NOTE: This is the script that PSF maintains actively and it is updated annually as new versions of NuSEDS are released. The corresponding script used in Atkinson et al. 2025. Monitoring for fisheries or for fish? Declines in monitoring of salmon spawners continue despite a conservation crisis is accessible at: a_nuseds_collation.Rmd.

1 Import datasets

The NUSEDS (all_areas_nuseds) and CUSS (conservation_unit_census_sites) datasets are imported and duplicated rows are removed. The columns considered for NuSEDS are related to the population, location, fish counts and year of assessment:

## [1] "File imported: All Areas NuSEDS_20251103.csv ; Date modified: 2026-01-08 09:04:18.530449"
## [1] "File imported: Conservation Unit Census Sites_20250528.csv ; Date modified: 2026-01-08 09:09:46.487137"

The list of conservation units (CUs) as shown in the Pacific Salmon Explorer (PSE) is imported. The columns cu_name_pse and cu_name_dfo correspond to CU_NAME in CUSS, cu_index corresponds to FULL_CU_INDEX.

##   region species_name species_qualified cuid             cu_name_pse
## 1 Skeena      Sockeye               SEL  171                Alastair
## 2 Skeena      Sockeye               SEL  174           Ecstall/Lower
## 3 Skeena      Sockeye               SEL  175                Johnston
## 4 Skeena      Sockeye               SEL  176             Kitsumkalum
## 5 Skeena      Sockeye               SEL  177                 Lakelse
## 6 Skeena      Sockeye               SEL  172 Mcdonell/Dennis/Aldrich
##     cu_name_dfo  cu_index
## 1      Alastair SEL-20-01
## 2 Ecstall/Lower SEL-20-04
## 3      Johnston SEL-20-05
## 4   Kitsumkalum SEL-20-06
## 5       Lakelse SEL-20-07
## 6       Aldrich SEL-20-02

The DFO file of the stream locations and coordinates (emailed from Wu Zhipeng, DFO, 09/04/2024) is imported. The columns NME and ID correspond to SYSTEM_SITE and GFE_ID in CUSS, respectively.

## # A tibble: 6 × 4
##   NME               ID X_LONGT           Y_LAT           
##   <chr>          <dbl> <chr>             <chr>           
## 1 STANLEY CREEK   1557 -132.691420767338 53.9746116925906
## 2 DAVIDSON CREEK  1558 -132.682181062338 53.9544135568635
## 3 NADEN RIVER     1559 -132.672499482506 53.9448571061959
## 4 LIGNITE CREEK   1560 -132.601901312242 53.97114945092  
## 5 OTUN RIVER      1561 -132.260106013    54.07328672     
## 6 BILL CREEK      1562 -132.395009112518 53.7494926521361

The DFO hatchery data file (emailed from Brock Ramshaw, DFO, 03/07/2023) is imported because it contains GFE_ID (i.e., STOCK_GFE_ID and REL_GFE_ID) and coordinates that are missing in the above DFO stream location file:

## # A tibble: 6 × 8
##   STOCK_GFE_ID STOCK_WATERBODY_NAME STOCK_LATITUDE STOCK_LONGITUDE REL_GFE_ID
##          <dbl> <chr>                         <dbl>           <dbl>      <dbl>
## 1         1001 TANKEEAH RIVER                 52.3           -128.       1001
## 2         1001 TANKEEAH RIVER                 52.3           -128.       1001
## 3         1001 TANKEEAH RIVER                 52.3           -128.      11611
## 4         1001 TANKEEAH RIVER                 52.3           -128.      11611
## 5         1001 TANKEEAH RIVER                 52.3           -128.       1001
## 6         1001 TANKEEAH RIVER                 52.3           -128.       1001
## # ℹ 3 more variables: REL_WATERBODY_NAME <chr>, REL_LATITUDE <dbl>,
## #   REL_LONGITUDE <dbl>

The regions and CUs shape files as defined in the PSE are imported:

2 Initial modifications

We remove from NUSEDS and CUSS data related to steelhead, Atlantic salmon and Kokanee salmon, corresponding to 980 rows and 133 populations (POP_ID) in NUSEDS and 0 in CUSS.

We create the column IndexId, which the the combination of the species acronym (i.e., CO, CM, CN, PKO, PKE, SX) and POP_ID:

## [1] "CN_48442"  "CO_590"    "CO_1794"   "CO_1844"   "CO_2172"   "PKE_51264"

We create the column MAX_ESTIMATE which is the maximum value of the fish count columns:

## [1] "NATURAL_ADULT_SPAWNERS"    "NATURAL_JACK_SPAWNERS"    
## [3] "NATURAL_SPAWNERS_TOTAL"    "ADULT_BROODSTOCK_REMOVALS"
## [5] "JACK_BROODSTOCK_REMOVALS"  "TOTAL_BROODSTOCK_REMOVALS"
## [7] "OTHER_REMOVALS"            "TOTAL_RETURN_TO_RIVER"

3 Fixes on NUSEDS and CUSS

The major issue in NuSEDS is the discrepancy between the two datasets. Normally, there should be the same population references (i.e., unique IndexId/POP_ID & GFE_ID combinations). But there are 11591 unique population references in NUSEDS and only 7149 in CUSS. Additionally, there are 4447 references in NUSEDS that are not in CUSS and 5 that are in CUSS but not in NUSEDS. This section consists in solving these discrepancies so that the same population references are present in both datasets.

3.1 Duplicated rows

There are 60 duplicated rows in NUSEDS when considering the following fields:

##  [1] "SPECIES"                   "POP_ID"                   
##  [3] "GFE_ID"                    "Year"                     
##  [5] "NATURAL_ADULT_SPAWNERS"    "NATURAL_JACK_SPAWNERS"    
##  [7] "NATURAL_SPAWNERS_TOTAL"    "ADULT_BROODSTOCK_REMOVALS"
##  [9] "JACK_BROODSTOCK_REMOVALS"  "TOTAL_BROODSTOCK_REMOVALS"
## [11] "OTHER_REMOVALS"            "TOTAL_RETURN_TO_RIVER"    
## [13] "ENUMERATION_METHODS"       "ESTIMATE_CLASSIFICATION"
##    SPECIES POP_ID GFE_ID Year MAX_ESTIMATE
## 1  Sockeye  44565   2489 1997           NA
## 2  Sockeye  44565   2489 1998           NA
## 3  Sockeye  44565   2489 2001           NA
## 4  Sockeye  44565   2489 2002           NA
## 5     Coho  44567   2489 1995           NA
## 6     Coho  44567   2489 1996           NA
## 7     Coho  44567   2489 1997            1
## 8     Coho  44567   2489 1998           12
## 9     Coho  44567   2489 1999           NA
## 10    Coho  44567   2489 2000           NA
## 11    Coho  44567   2489 2001           NA
## 12    Coho  44567   2489 2002           NA
## 13    Coho  44567   2489 2003           NA
## 14    Coho  44567   2489 2004           NA
## 15    Coho  44567   2489 2005           NA
## 16    Coho  44567   2489 2007           NA
## 17    Coho  44567   2489 2008           NA
## 18    Coho  44567   2489 2009           NA
## 19    Coho  44567   2489 2010           NA
## 20    Coho  44567   2489 2011           NA
## 21    Coho  44567   2489 2012           NA
## 22    Coho  44567   2489 2013           NA
## 23    Coho  44567   2489 2014           NA
## 24    Coho  44567   2489 2015           NA
## 25    Coho  44567   2489 2016           NA
## 26    Coho  44567   2489 2017           NA
## 27    Coho  44567   2489 2018           NA
## 28    Coho  44567   2489 2019           NA
## 29    Pink  44569   2489 1997           NA
## 30    Pink  44569   2489 1998           NA
## 31    Pink  44569   2489 2001           NA
## 32    Pink  44569   2489 2002           NA
## 33    Chum  44571   2489 1995           NA
## 34    Chum  44571   2489 1996           NA
## 35    Chum  44571   2489 1997          126
## 36    Chum  44571   2489 1998           93
## 37    Chum  44571   2489 1999           NA
## 38    Chum  44571   2489 2000           NA
## 39    Chum  44571   2489 2001           NA
## 40    Chum  44571   2489 2002           NA
## 41    Chum  44571   2489 2003           NA
## 42    Chum  44571   2489 2004           NA
## 43    Chum  44571   2489 2005           NA
## 44    Chum  44571   2489 2008           NA
## 45    Chum  44571   2489 2009           NA
## 46    Chum  44571   2489 2010           NA
## 47    Chum  44571   2489 2011           NA
## 48    Chum  44571   2489 2012           NA
## 49    Chum  44571   2489 2013           NA
## 50    Chum  44571   2489 2014           NA
## 51    Chum  44571   2489 2015           NA
## 52    Chum  44571   2489 2016           NA
## 53    Chum  44571   2489 2017           NA
## 54    Chum  44571   2489 2018           NA
## 55    Chum  44571   2489 2019           NA
## 56 Chinook  44573   2489 1997           NA
## 57 Chinook  44573   2489 1998           NA
## 58 Chinook  44573   2489 2001           NA
## 59 Chinook  44573   2489 2002           NA
## 60    Chum  51316   1174 2011           NA

We assume these are real duplicated rows in NUSEDS so we remove them.

For CUSS, 0 duplicated rows are found:

##  [1] GFE_ID              CENSUS_SITE         GFE_TYPE           
##  [4] SPECIES_QUALIFIED   Y_LAT               X_LONGT            
##  [7] FAZ_ACRO            MAZ_ACRO            JAZ_ACRO           
## [10] CU_NAME             CU_ACRO             CU_LAT             
## [13] CU_LONGT            CU_TYPE             CU_INDEX           
## [16] FULL_CU_IN          SBJ_ID              POP_ID             
## [19] IS_INDICATOR        CMNTS               EFFECTIVE_DT       
## [22] WATERSHED_CDE       FWA_WATERSHED_CDE   coordinates_changed
## [25] SPECIES             species_acronym_ncc IndexId            
## <0 rows> (or 0-length row.names)

These rows (if any) are removed as well.

3.2 Conflictual counts in NUSEDS

There are 2 instances where multiple MAX_ESTIMATE values are available in a same Year for a given population:

##    IndexId GFE_ID Year MAX_ESTIMATE        ESTIMATE_METHOD
## 1 CN_45164   2224 2023  3957.000000             Sonar-ARIS
## 2 CN_45164   2224 2023   462.000000       Peak Live + Dead
## 3 CN_46447    241 2016   494.000000 Calibrated Time Series
## 4 CN_46447    241 2016   475.414455 Calibrated Time Series
##       ESTIMATE_CLASSIFICATION
## 1     TRUE ABUNDANCE (TYPE-1)
## 2 RELATIVE ABUNDANCE (TYPE-4)
## 3 RELATIVE ABUNDANCE (TYPE-4)
## 4 RELATIVE ABUNDANCE (TYPE-4)

For CN_45164, we keep the observation with the largest count because the method associated is better (TYPE 1 vs. TYPE 4).

For CN_46447, there are different CREATED_DTT and UPDATED_DTT for the two entries:

##    IndexId GFE_ID Year NATURAL_ADULT_SPAWNERS        ESTIMATE_METHOD
## 1 CN_46447    241 2016             494.000000 Calibrated Time Series
## 2 CN_46447    241 2016             475.414455 Calibrated Time Series
##   CREATED_DTT UPDATED_DTT
## 1   06-AUG-21   26-MAR-25
## 2   16-APR-24

We keep the observation with the most recent entry: the one with UPDATED_DTT = 26-MAR-25.

3.3 Location without Adaptive Zone data in CUSS

There is one instance of a location (GFE_ID) that does not have values for FAZ_ACRO,MAZ_ACRO,JAZ_ACRO:

##       CENSUS_SITE GFE_ID FAZ_ACRO MAZ_ACRO JAZ_ACRO
## 1 PUNTLEDGE RIVER   1156      EVI     GStr EVI+GStr
## 2 PUNTLEDGE RIVER   1156

We edit CUSS accordingly.

3.4 Missing coordinates in CUSS

There are 9 locations without coordinates in any of the source files and for which we have to define coordinates manually (cf., code to see the sources when available):

##                                     CENSUS_SITE     GFE_ID     X_LONGT
## 1                            NICOLA RIVER (DAM)      54282 -120.665873
## 2                            NICOLA RIVER (DOT)      54283 -121.315659
## 3                          BIRCH ISLAND CHANNEL      33045 -119.889197
## 4                            BLUE RIVER - UPPER 1921661712 -119.289006
## 5          MIDDLE NORTH THOMPSON RIVER MAINSTEM 1754239945 -119.701668
## 6                          PIG CHANNEL COMBINED  719256264  119.810014
## 7           UPPER NORTH THOMPSON RIVER MAINSTEM 2049268617 -119.177796
## 8 UPPER NORTH THOMPSON RIVER-MILEDGE CONFLUENCE      57041 -119.173514
## 9                       CHILCOTIN RIVER - UPPER       2462 -124.028396
##       Y_LAT
## 1 50.163045
## 2 50.424454
## 3 51.598602
## 4 52.112252
## 5 51.593091
## 6 51.579465
## 7 52.345236
## 8 52.281312
## 9 52.339295

3.5 Remove the IndexId & GFE_ID time series in NUSEDS with only NAs and/or 0s

We remove these time series to reduce the amount of cleaning to do after. There is the option to remove series with (1) only NAs and 0s or (2) only NAs.

Decision made:

## [1] "Time series uniqually made of NAs are removed (those also containing 0s are kept)."

This procedure removes 102104 rows in NUSEDS, corresponding to 24.4% of the original dataset.

The data removed concerns 4425 time series, which are referenced in removed_all:

##   IndexId GFE_ID dataset                   comment
## 1  CN_134   2615  NUSEDS Only NAs for MAX_ESTIMATE
## 2  SX_142   2615  NUSEDS Only NAs for MAX_ESTIMATE
## 3  CN_144   2624  NUSEDS Only NAs for MAX_ESTIMATE
## 4  SX_152   2624  NUSEDS Only NAs for MAX_ESTIMATE
## 5  CN_154   2625  NUSEDS Only NAs for MAX_ESTIMATE
## 6  SX_162   2625  NUSEDS Only NAs for MAX_ESTIMATE

Note that we do not remove the references of these series (i.e., their IndexId & GFE_ID association) in CUSS in order to not loose information.

3.6 Fix populations in NUSEDS that are not in CUSS

The goal is to look for each IndexId/POP_ID & GFE_ID series in NUSEDS that are not in CUSS and to proceed as follow:

  1. The series with a IndexId/POP_ID associate with a FULL_CU_IN (field added in NUSEDS since 2025-11-03) are added to CUSS. The step might require to add a new IndexId/POP_ID and/or GFE_ID to CUSS.

  2. The missing FULL_CU_IN (and consequently CU_NAME and other related fields) of the remaining series is found by intersecting the coordinates (X_LONGT and Y_LAT) associated with the GFE_ID with the PSE’s CUs’ shape files. Several GFE_ID are missing coordinates so we look in the DFO stream file and the DFO hatchery file as alternative sources. Once associated to a CU, these series are added to CUSS.

There are 264 IndexId & GFE_ID time series in NUSEDS whose references are not present in CUSS, among which 53 already have a FULL_CU_IN.

Among the 194 locations (GFE_ID) associated to these series, 127 are present with coordinates in CUSS, 40 had to be found in the DFO file of the stream locations, 4 were found in the DFO hatchery file and 23 were not found anywhere. In the next section, we attribute coordinates to the latter manually.

3.6.1 GFE_ID with missing X_LONGT and Y_LAT

There are 23 locations with coordinates we manually defined using Google Maps or any resource we could find.

There are two Clearwater Creek WATERBODY associated to Phillips River (Campbell River Area) POPULATION in NUSEDS:

##          WATERBODY    GFE_ID
## 1 CLEARWATER CREEK      2618
## 2 Clearwater Creek 446068199

None of the two GFE_ID are in CUSS

cond <- conservation_unit_census_sites$GFE_ID %in% c(2618,446068199)
show <- unique(conservation_unit_census_sites[cond,c("CENSUS_SITE","GFE_ID","X_LONGT","Y_LAT")])
show <- as.data.frame(show)
rownames(show) <- NULL
show
## [1] CENSUS_SITE GFE_ID      X_LONGT     Y_LAT      
## <0 rows> (or 0-length row.names)

One of them is in the DFO file of the stream locations:

cond <- DFO_All_Streams_Segments$ID %in% c(2618,446068199)
show <- unique(DFO_All_Streams_Segments[cond,c("NME","ID","X_LONGT","Y_LAT")])
show <- as.data.frame(show)
rownames(show) <- NULL
show
##                NME   ID        X_LONGT         Y_LAT
## 1 CLEARWATER CREEK 2618 -125.276480824 50.6158337647

By comparing all the time series having the same indexId/POP_ID with those two GFE_ID, we can see all of them can be merged:

We consequently assume that the two locations are the same and we merge the time time series by replacing the GFE_ID value without coordinates (446068199) by the one with coordinates (2618) in NUSEDS.

## [1] "Series SX_50610 - 446068199 merged to series SX_50610 - 2618 in NUSEDS"
## [1] "3 rows were edited in all_areas_nuseds at the following fields: AREA, WATERBODY, GAZETTED_NAME, LOCAL_NAME_1, LOCAL_NAME_2, WATERSHED_CDE, WATERBODY_ID, GFE_ID, FWA_WATERSHED_CDE"
## [1] ""
## [1] "Series CO_50612 - 446068199 merged to series CO_50612 - 2618 in NUSEDS"
## [1] "1 rows were edited in all_areas_nuseds at the following fields: AREA, WATERBODY, GAZETTED_NAME, LOCAL_NAME_1, LOCAL_NAME_2, WATERSHED_CDE, WATERBODY_ID, GFE_ID, FWA_WATERSHED_CDE"
## [1] ""
## [1] "Series CM_50616 - 446068199 merged to series CM_50616 - 2618 in NUSEDS"
## [1] "1 rows were edited in all_areas_nuseds at the following fields: AREA, WATERBODY, GAZETTED_NAME, LOCAL_NAME_1, LOCAL_NAME_2, WATERSHED_CDE, WATERBODY_ID, GFE_ID, FWA_WATERSHED_CDE"
## [1] ""
## [1] "Series CN_50618 - 446068199 merged to series CN_50618 - 2618 in NUSEDS"
## [1] "1 rows were edited in all_areas_nuseds at the following fields: AREA, WATERBODY, GAZETTED_NAME, LOCAL_NAME_1, LOCAL_NAME_2, WATERSHED_CDE, WATERBODY_ID, GFE_ID, FWA_WATERSHED_CDE"
## [1] ""
## [1] "Series PKO_50614 - 446068199 merged to series PKO_50614 - 2618 in NUSEDS"
## [1] "2 rows were edited in all_areas_nuseds at the following fields: AREA, WATERBODY, GAZETTED_NAME, LOCAL_NAME_1, LOCAL_NAME_2, WATERSHED_CDE, WATERBODY_ID, GFE_ID, FWA_WATERSHED_CDE"
## [1] ""

There is a typo for Cariboo river:

cond <- all_areas_nuseds$GFE_ID == 2467
show <- unique(all_areas_nuseds[cond,c("WATERBODY","GFE_ID","IndexId","POPULATION")])
rownames(show) <- NULL
show
##                WATERBODY GFE_ID  IndexId
## 1 CARRIBOO RIVER (LOWER)   2467 CN_46891
##                                         POPULATION
## 1 Cariboo River (Williams Lake Area) Chinook Run 1

We correct it to CARIBOO RIVER - LOWER in NUSEDS to match we other notations of the same river:

cond <- grepl("CARRIBOO",all_areas_nuseds$WATERBODY)
all_areas_nuseds$WATERBODY[cond] <- "CARIBOO RIVER - LOWER"

cond <- grepl("CARRIBOO",trackRecord_nuseds_nocuss$WATERBODY)
trackRecord_nuseds_nocuss$WATERBODY[cond] <- "CARIBOO RIVER - LOWER"

cond <- grepl("CARIBOO",all_areas_nuseds$WATERBODY)
show <- unique(all_areas_nuseds[cond,c("WATERBODY","GFE_ID")])
rownames(show) <- NULL
show
##               WATERBODY GFE_ID
## 1 CARIBOO RIVER - UPPER   2466
## 2         CARIBOO RIVER    290
## 3 CARIBOO RIVER - LOWER   2467

Here are the locations for which we manually defined coordinates:

##       GFE_ID                WATERBODY     X_LONGT     Y_LAT
## 1      11487  STAMP RIVER BELOW FALLS -124.911882 49.327917
## 2      11488 SPROAT RIVER BELOW FALLS -124.902346 49.290073
## 3    7990588    RETURN CHANNEL CREEKS -128.135232 52.276281
## 4       1303     KENNEDY LAKE BEACHES -125.580441 49.049133
## 5      52749          SILVERMERE LAKE -122.408753 49.174967
## 6      55417            CAMILOS CREEK -121.393056 49.386667
## 7      57042             HANSEN CREEK -128.346389 50.778889
## 8       2464 COTTONWOOD RIVER - LOWER -122.610584 53.119965
## 9       2467    CARIBOO RIVER - LOWER -121.444278 52.721027
## 10     19723     SKEENA RIVER - NORTH -128.636866 54.501018
## 11       432    WIMINASIK LAKE SYSTEM -128.049384 56.489746
## 12      2461     NICOLA RIVER - LOWER -121.021924 50.145281
## 13        95         SEMMIHAULT CREEK -121.956111 49.153056
## 14      2681           MCLELLAN CREEK -126.619731  52.38783
## 15     67553             GIBSON CREEK -123.499173 49.406058
## 16       208         COTTONWOOD CREEK -123.145631 49.762523
## 17       199                FEE CREEK -123.342487 50.068778
## 18 489440637     Atlatzi+Clear Rivers  -126.11918 51.015775
## 19      3505              OLSEN CREEK -122.631537 49.597168
## 20      1307    CLAYOQUOT ARM BEACHES -125.591878 49.099033
## 21      2792               MOORE LAKE -129.501134  53.40953
## 22     64745            ATNARKO LAKES -125.713941 52.163834

3.6.2 Find the CU of population without a FULL_CU_ID

There are 208 IndexId/POP_ID not associated to a CU (and 211 time series):

##          IndexId                                              POPULATION
## 1        CM_1622                      Maple Creek (Coquitlam) Chum Run 1
## 2        CM_2434                     Stoney Creek (Coquitlam) Chum Run 1
## 3         CM_297                   Alouette River (Coquitlam) Chum Run 2
## 4        CM_3212                                    Clayoquot River Chum
## 5       CM_41256                                      Satchie Creek Chum
## 6       CM_41371                                     Oktwanch River Chum
## 7       CM_42458                             Tlell River (Sandspit) Chum
## 8       CM_43568                         Cascade Creek (Lower Nass) Chum
## 9       CM_44524                               Hesquiat Point Creek Chum
## 10      CM_44558                                    Muriel S1 Creek Chum
## 11      CM_45283                                         Draw Creek Chum
## 12      CM_45393       Carrington Creek (Campbell River Area) Chum Run 1
## 13      CM_45644             Atnarko Spawning Channel (Bella Coola) Chum
## 14      CM_45931                Return Channel Creeks (Bella Bella) Chum
## 15   CM_46787933                                  Muriel Lake Creek Chum
## 16      CM_47648                       Damshilgwit Creek (Smithers) Chum
## 17      CM_51026              Health Lagoon Creek (Alert Bay) Chum Run 1
## 18      CM_51161                   Carriden Creek (Alert Bay) Chum Run 1
## 19      CM_52086                                       Kennedy Lake Chum
## 20      CM_52096                        Kennedy Lake Feeder Streams Chum
## 21      CM_52171                                         Cold Creek Chum
## 22      CM_52186                                       Angora Creek Chum
## 23        CM_562                      Byrne Creek (Coquitlam) Chum Run 1
## 24       CM_7009                      Irgens Creek (Pender Harbour) Chum
## 25       CM_7734                               Coho Creek (Mission) Chum
## 26       CM_7735                              Eagle Creek (Mission) Chum
## 27       CM_7736                          Holachten Creek (Mission) Chum
## 28       CM_7737                            Mystery Creek (Mission) Chum
## 29       CM_7739                          Silvermere Lake (Mission) Chum
## 30       CM_7765                                Coho Creek (Tofino) Chum
## 31       CM_7768                           Cultus Lake (Chilliwack) Chum
## 32       CM_7769                         Camilos Creek (Chilliwack) Chum
## 33       CM_7774                        English Cove Creek (Tofino) Chum
## 34       CM_7775                      Noosgulch River (Bella Coola) Chum
## 35       CM_7784                      Hansen Creek (Campbell River) Chum
## 36       CM_7796              Fancy Cove (Head) Creek (Bella Bella) Chum
## 37       CM_7801                      Colvin Creek (Pender Harbour) Chum
## 38       CM_7812          Fancy Cove Right Hand Creek (Bella Bella) Chum
## 39       CM_7816                           Fell Creek (Bella Bella) Chum
## 40       CM_7827                                      Malcolm Creek Chum
## 41       CN_3213                                 Clayoquot River Chinook
## 42       CN_3320                               Lower Stamp River Chinook
## 43       CN_3324                                     Upper Stamp Chinook
## 44       CN_3334                  Nanaimo River Chinook (unknown timing)
## 45      CN_41103                           White Pine Cove Creek Chinook
## 46      CN_45645          Atnarko Spawning Channel (Bella Coola) Chinook
## 47      CN_46658                   Dunn Creek (Clearwater) Chinook Run 1
## 48      CN_46668              Mctaggart Creek (Clearwater) Chinook Run 1
## 49      CN_46841      Chilcotin River (Williams Lake Area) Chinook Run 1
## 50      CN_47156                     Anderson River (Chilliwack) Chinook
## 51      CN_49102      Hanson's Creek (Campbell River Area) Chinook Run 1
## 52      CN_50009               Upper Paradise Channel (Squamish) Chinook
## 53      CN_50319                   Hugh Creek (Kitimat-Butedale) Chinook
## 54      CN_50638        Grassy Creek (Campbell River Area) Chinook Run 1
## 55      CN_50698       Wortley Creek (Campbell River Area) Chinook Run 1
## 56       CN_7809              Okanagan River (Salmon Arm) Summer Chinook
## 57       CN_7834             Whiminasik Lake System (Upper Nass) Chinook
## 58       CO_1624                            Maple Creek (Coquitlam) Coho
## 59       CO_1826                           Nevin Creek (Chilliwack) Coho
## 60       CO_3027                  Wathlsto Creek (Kitimat-Butedale) Coho
## 61       CO_3211                                    Clayoquot River Coho
## 62       CO_3321                                        Upper Stamp Coho
## 63      CO_43196          Douglas Inlet Head Creek (West Coast QCI) Coho
## 64      CO_43616                         Donahue Creek (Lower Nass) Coho
## 65      CO_44441                              Deep Creek (Smithers) Coho
## 66      CO_44840                            Crag Creek (Lower Nass) Coho
## 67      CO_44988                          Palmer Creek (Salmon Arm) Coho
## 68      CO_45303                         Klemtu Creek (Bella Bella) Coho
## 69      CO_45642             Atnarko Spawning Channel (Bella Coola) Coho
## 70      CO_45879        Whaletown Creek (Campbell River Area) Coho Run 1
## 71      CO_46310                            Ross Creek (Salmon Arm) Coho
## 72      CO_46855                        Taseko Lake (Williams Lake) Coho
## 73      CO_46865                        Elkin Creek (Williams Lake) Coho
## 74      CO_47232                           Yalakom River (Lillooet) Coho
## 75      CO_48090                            Shannon Creek (Terrace) Coho
## 76      CO_48205                          Anudol Creek (Upper Nass) Coho
## 77      CO_49415                       Lower Tahlo Creek (Smithers) Coho
## 78      CO_52082                                       Kennedy Lake Coho
## 79      CO_52242                                       Meares Creek Coho
## 80      CO_52262                                        Sharp Creek Coho
## 81      CO_52272                                    Cone Creeks (2) Coho
## 82      CO_52376                     Mckay Creek (Kitimat-Butedale) Coho
## 83        CO_572                         Calkins Creek (Chilliwack) Coho
## 84       CO_7420                           Teto Creek (Salmon Arm) Coho 
## 85       CO_7729                  Clayton Falls Creek (Bella Coola) Coho
## 86       CO_7730                             Bings Creek (Cowichan) Coho
## 87       CO_7747                                 Bear Creek (Comox) Coho
## 88       CO_7759                           Averill Creek (Cowichan) Coho
## 89       CO_7764                                Coho Creek (Tofino) Coho
## 90       CO_7771                    Blue Lead Creek (Williams Lake) Coho
## 91       CO_7772                        Wasko Creek (Williams Lake) Coho
## 92       CO_7773                        English Cove Creek (Tofino) Coho
## 93       CO_7776                 Jenny Bay West Creek (Bella Coola) Coho
## 94       CO_7779                         Quilchena Creek (Lillooet) Coho
## 95       CO_7780                    Hazeltine Creek (Williams Lake) Coho
## 96       CO_7793                       Mclellan Creek (Bella Coola) Coho
## 97       CO_7802                      Colvin Creek (Pender Harbour) Coho
## 98       CO_7814          Fannie Cove Left Hand Creek (Bella Bella) Coho
## 99       CO_7826                                      Malcolm Creek Coho
## 100      CO_7828                                       Gibson Creek Coho
## 101       CO_792                  Cottonwood Creek (Squamish) Coho Run 1
## 102       CO_959                         Fee Creek (Squamish) Coho Run 1
## 103      PKE_180                 Myers Creek (Pender Harbour) Pink Run 1
## 104     PKE_3167                   Nooklikonnik Creek (Bella Coola) Pink
## 105     PKE_3168                       Nusatsum River (Bella Coola) Pink
## 106     PKE_3277                                          Dak River Pink
## 107     PKE_3432                                   Gobeil Bay Creek Pink
## 108     PKE_3447         Fannie Cove Right Hand Creek (Bella Bella) Pink
## 109    PKE_45304                         Klemtu Creek (Bella Bella) Pink
## 110    PKE_45454                             Babine Lake (Smithers) Pink
## 111    PKE_45643             Atnarko Spawning Channel (Bella Coola) Pink
## 112    PKE_45930                Return Channel Creeks (Bella Bella) Pink
## 113    PKE_48638                     Roberts Creek (Pender Harbour) Pink
## 114    PKE_48658                      Wilson Creek (Pender Harbour) Pink
## 115    PKE_50012                        Tenderfoot Creek (Squamish) Pink
## 116    PKE_51064                Wahkana Bay Creek (Alert Bay) Pink Run 1
## 117    PKE_51314                                         Cook Creek Pink
## 118    PKE_51474                    Departure Creek (Nanaimo) Pink Run 1
## 119    PKE_51882                        Rainbow Creek (Bella Bella) Pink
## 120    PKE_52377                     Mckay Creek (Kitimat-Butedale) Pink
## 121    PKE_52387                   Tetlock Creek (Kitimat-Butedale) Pink
## 122    PKE_52392                    Hunter Creek (Kitimat-Butedale) Pink
## 123    PKE_52492                   Alvin Creek (Grenville Principe) Pink
## 124    PKE_52754             Owen Creek (Campbell River Area) Pink Run 1
## 125    PKE_53044                         Little River (Comox) Pink Run 1
## 126 PKE_54793433               Dominic Creek (Quatsino Sound) Pink Run 1
## 127     PKE_7797              Fancy Cove (Head) Creek (Bella Bella) Pink
## 128     PKE_7824                      Colvin Creek (Pender Harbour) Pink
## 129 PKE_88092422                             Dasque Creek (Terrace) Pink
## 130      PKO_180                 Myers Creek (Pender Harbour) Pink Run 1
## 131     PKO_3195                                  Whitebottom Creek Pink
## 132     PKO_3277                                          Dak River Pink
## 133    PKO_40049                Goldstream River (Victoria/Saanich) Pink
## 134    PKO_41037                              Kumdis Creek (Masset) Pink
## 135    PKO_43007                       Bag Harbour Creek (Sandspit) Pink
## 136    PKO_44654                           Gaspard Creek (Lillooet) Pink
## 137    PKO_45643             Atnarko Spawning Channel (Bella Coola) Pink
## 138    PKO_45930                Return Channel Creeks (Bella Bella) Pink
## 139    PKO_49088              Store Creek (Powell River Area) Pink Run 1
## 140    PKO_50007                  Upper Paradise Channel (Squamish) Pink
## 141    PKO_50012                        Tenderfoot Creek (Squamish) Pink
## 142    PKO_50154                 South Twin Creek (Vancouver) Pink Run 1
## 143    PKO_52377                     Mckay Creek (Kitimat-Butedale) Pink
## 144    PKO_52392                    Hunter Creek (Kitimat-Butedale) Pink
## 145    PKO_52492                   Alvin Creek (Grenville Principe) Pink
## 146    PKO_52754             Owen Creek (Campbell River Area) Pink Run 1
## 147    PKO_53044                         Little River (Comox) Pink Run 1
## 148     PKO_7792                      Noosgulch River (Bella Coola) Pink
## 149     PKO_7797              Fancy Cove (Head) Creek (Bella Bella) Pink
## 150     PKO_7811           Bullock Channel East Creek (Bella Bella) Pink
## 151     PKO_7813          Fancy Cove Right Hand Creek (Bella Bella) Pink
## 152     PKO_7817                           Fell Creek (Bella Bella) Pink
## 153 PKO_88092422                             Dasque Creek (Terrace) Pink
## 154      SX_1874      Olsen Creek (Coquitlam) Early Summer Sockeye Run 1
## 155      SX_3077                  Butterfield Creek Early Stuart Sockeye
## 156      SX_3085                    Tarnezell Creek Early Stuart Sockeye
## 157      SX_3092                       Nancut Creek Early Stuart Sockeye
## 158      SX_3221                          Sus Creek Early Summer Sockeye
## 159      SX_3438                               Loftus Creek Late Sockeye
## 160     SX_39730               East Creek (Quatsino Sound) Sockeye Run 1
## 161     SX_40015                   Cowichan River (Duncan) Sockeye Run 1
## 162     SX_42680 Security Inlet Left Hand Creek (West Coast QCI) Sockeye
## 163     SX_42950    Werner Bay South Right Hand Creek (Sandspit) Sockeye
## 164     SX_43095          Louscoone Inlet Creek (West Coast QCI) Sockeye
## 165     SX_43770                Webster Lake Creek (Bella Bella) Sockeye
## 166     SX_44140         Gribble Island Creek (Kitimat-Butedale) Sockeye
## 167     SX_44160           Fishtrap Bay Creek (Kitimat-Butedale) Sockeye
## 168     SX_44165        Little Tillhorn Creek (Kitimat-Butedale) Sockeye
## 169     SX_45013          Burton Creek (Clearwater) Early Summer Sockeye
## 170     SX_45038          Raft Creek (Williams Lake Area) Summer Sockeye
## 171     SX_45040       Winkley Creek (Williams Lake Area) Summer Sockeye
## 172     SX_45056          Amos Creek (Williams Lake Area) Summer Sockeye
## 173     SX_45084    Upper Momich River (Clearwater) Early Summer Sockeye
## 174     SX_45641          Atnarko Spawning Channel (Bella Coola) Sockeye
## 175     SX_46248   Adams River (Upper) (Clearwater) Early Summer Sockeye
## 176     SX_46268        Momich/Cayenne (Clearwater) Early Summer Sockeye
## 177     SX_46278         Cayenne Creek (Clearwater) Early Summer Sockeye
## 178     SX_46279                 Cayenne Creek (Clearwater) Late Sockeye
## 179     SX_46308            Ross Creek (Salmon Arm) Early Summer Sockeye
## 180     SX_48264                       Tseax Slough (Upper Nass) Sockeye
## 181     SX_50005               Upper Paradise Channel (Squamish) Sockeye
## 182     SX_50010                     Tenderfoot Creek (Squamish) Sockeye
## 183     SX_50340                  Wathl Creek (Kitimat-Butedale) Sockeye
## 184     SX_51000                 Gilford Creek (Alert Bay) Sockeye Run 1
## 185     SX_51060             Wahkana Bay Creek (Alert Bay) Sockeye Run 1
## 186     SX_51145                  Embley Creek (Alert Bay) Sockeye Run 1
## 187     SX_51155                Carriden Creek (Alert Bay) Sockeye Run 1
## 188     SX_52080                                    Kennedy Lake Sockeye
## 189     SX_52120                           Clayoquot Arm Beaches Sockeye
## 190     SX_52390                 Hunter Creek (Kitimat-Butedale) Sockeye
## 191     SX_52750          Owen Creek (Campbell River Area) Sockeye Run 1
## 192     SX_52890       Menzies Creek (Campbell River Area) Sockeye Run 1
## 193     SX_52960        Willow Creek (Campbell River Area) Sockeye Run 1
## 194      SX_7490                                    Middle River Sockeye
## 195      SX_7491                                  Stephens Creek Sockeye
## 196      SX_7639               Four Mile Creek (Salmon Arm) Late Sockeye
## 197      SX_7640                   Hlina Creek (Salmon Arm) Late Sockeye
## 198      SX_7689             Blue Creek (Coquitlam) Early Summer Sockeye
## 199      SX_7694          Cypress Creek (Coquitlam) Early Summer Sockeye
## 200      SX_7805                 Moore Lake (Grenville Principe) Sockeye
## 201      SX_7807                           Sawki Creek (Mission) Sockeye
## 202      SX_7815       Fannie Cove Left Hand Creek (Bella Bella) Sockeye
## 203      SX_7818                                   Atnarko Lakes Sockeye
## 204      SX_7822                                    Joffre Creek Sockeye
## 205      SX_7823       Ormond Creek (Prince George) Early Summer Sockeye
## 206      SX_7833             Whiminaski Lake System (Upper Nass) Sockeye
## 207      SX_7835                     Illiance River (Lower Nass) Sockeye
## 208  SX_80808204                          Dasque Creek (Terrace) Sockeye

We identify the CU they belong to using the CUs’ shape files of the PSE. Series with X_LONGT and Y_LAT values that intersects one unique CU layer of the same species are attributed the CU’s FULL_CU_IN. When more than one CU layer is intersected, we use the information in POPULATION and WATERBODY to manually select the correct CU if possible.

Below we show the 0 time series for which we could not attribute a CU:

## [1] "There are not any."

There are 3 CUs (5 time series) having a cu_name_pse and cuid but for who we could not find a CU_NAME and FULL_CU_IN in CUSS:

## [1] "The series below is/are associated to CU Upper Skeena (cuid = 242) in the PSE:"

## [1] "The series below is/are associated to CU Adams-Early Summer (cuid = 760) in the PSE:"

## [1] "The series below is/are associated to CU Momich-Early summer (cuid = 761) in the PSE:"

Finally, there are 2 time series with a CU_NAME and FULL_CU_IN that are not in CUSS and for which we could not find the corresponding cuid and cu_name_pse in the PSE:

##   IndexId GFE_ID FULL_CU_IN          WATERBODY       Y_LAT      X_LONGT
## 1 SX_3302   3416  SEL-13-xx GREAT CENTRAL LAKE 49.32764265 -124.9826436
## 2 SX_3302   3444  SEL-13-xx        SPROAT LAKE 49.28801087 -124.9153347
##   coordinates_from            POPULATION              CU_NAME species_name cuid
## 1             CUSS Somass System Sockeye GREAT CENTRAL/SPROAT      Sockeye   NA
## 2             CUSS Somass System Sockeye GREAT CENTRAL/SPROAT      Sockeye   NA
##   cu_name_pse
## 1        <NA>
## 2        <NA>
##                                                                  comment
## 1 CUID not found: IndexId = SEL-13-xx and CU_NAME = GREAT CENTRAL/SPROAT
## 2 CUID not found: IndexId = SEL-13-xx and CU_NAME = GREAT CENTRAL/SPROAT
##   comment_cu
## 1       <NA>
## 2       <NA>
## [1] "The series below have CU_NAME = GREAT CENTRAL/SPROAT and FULL_CU_IN = SEL-13-xx"

The reference of all these time series (i.e., IndexId/POP_ID & GFE_ID and associated fields) are added to CUSS in the next step.

3.6.3 Add the references of the missing time series to CUSS

First we add to CUSS the references of the 252 time series for which we found the CU_NAME and FULL_CU_IN in CUSS and the cuid and cu_name_pse in the PSE.

Second we add to CUSS the references of the 5 time series for which we did not find the CU_NAME and FULL_CU_IN but for which we found the cuid and cu_name_pse.

Finally, the 2 time series with a CU_NAME and FULL_CU_IN for which we could not find the corresponding cuid and cu_name_pse are added to CUSS.

So now all the time series in NUSEDS should be referenced in CUSS, is that true?

## [1] "YES :-)"

3.6.4 Find the cuid and cu_name of all the IndexId/POP_ID in CUSS and NUSEDS

In this section we attempt to find the PSE’s cuid and cu_name_pse to all CUs (FULL_IN_IN and CU_NAME) in CUSS. We then fill the missing information for the fields FULL_CU_IN, cuid and cu_name_pse in NUSEDS.

There are 5 time series without a FULL_CU_IN in NUSEDS and they should be the ones identified above, is that the case?

## [1] "YES :-)"

There are 86 time series and 22 FULL_CU_IN in NUSEDS for which we could not find a cuid and cu_name_pse:

##      IndexId  GFE_ID                        WATERBODY FULL_CU_IN
## 1    CM_1863     256             NORTH THOMPSON RIVER    CM-9004
## 2     CM_383     141                      BAKER CREEK    CM-9004
## 3   CM_46972      55                    RAILWAY CREEK    CM-9004
## 4   CM_47113     109                       RUBY CREEK    CM-9004
## 5   CM_47116     110                     HUNTER CREEK    CM-9004
## 6   CM_47120     111                 SILVERHOPE CREEK    CM-9004
## 7   CM_47124     112                 COQUIHALLA RIVER    CM-9004
## 8   CM_47131     113                     SUCKER CREEK    CM-9004
## 9   CM_47135     114                     KAWKAWA LAKE    CM-9004
## 10  CM_47136     115                     STEVEN CREEK    CM-9004
## 11  CM_47137     116                       MENZ CREEK    CM-9004
## 12  CM_47138     117                       KOPP CREEK    CM-9004
## 13  CM_47142     118                   AMERICAN CREEK    CM-9004
## 14  CM_47143     119                SQUEAH LAKE CREEK    CM-9004
## 15  CM_47146     120                      EMORY CREEK    CM-9004
## 16  CM_47149     122                       YALE CREEK    CM-9004
## 17  CM_47152     123                    SPUZZUM CREEK    CM-9004
## 18  CM_47155     124                   ANDERSON RIVER    CM-9004
## 19  CM_47177     127                 SEVEN MILE CREEK    CM-9004
## 20  CM_47180     128                      TEXAS CREEK    CM-9004
## 21  CM_47197 7990579         SETON AND CAYOOSH CREEKS    CM-9004
## 22    CN_294      14                   ALOUETTE RIVER    CK-9006
## 23  CN_40043    1210                  SHAWNIGAN CREEK    CK-9005
## 24  CN_42809    1584                    PALLANT CREEK    CK-9002
## 25  CN_46042     181                   CHEHALIS RIVER    CK-9006
## 26  CN_46048     183                     WEAVER CREEK    CK-9004
## 27  CN_47219     132                      GATES RIVER    CK-9004
## 28  CN_48009      34                      STAVE RIVER    CK-9006
## 29  CN_48672     760                    CHAPMAN CREEK    CK-9007
## 30  CN_49002     792                       LANG CREEK    CK-9007
## 31  CN_49062     798                   SLIAMMON CREEK    CK-9007
## 32  CN_49873     697                   CAPILANO RIVER    CK-9007
## 33    CN_696      62                 CHILLIWACK RIVER    CK-9008
## 34 PKE_47215     132                      GATES RIVER   PKE-9005
## 35   SX_1054    2477  FRASER RIVER - ABOVE TETE JAUNE     SER-06
## 36   SX_1099     169                       GOAT RIVER     SER-06
## 37   SX_2804    2037              WILLIAMS LAKE RIVER  SEL-03-xx
## 38   SX_3302    3444                      SPROAT LAKE  SEL-13-xx
## 39   SX_3302   11485          SOMASS-SPROAT-GC SYSTEM  SEL-13-xx
## 40   SX_3302    3416               GREAT CENTRAL LAKE  SEL-13-xx
## 41   SX_3389    2573                      HAWKS CREEK  SEL-03-xx
## 42   SX_3443    2466            CARIBOO RIVER - UPPER  SEL-06-18
## 43  SX_41615    1388                  OWOSSITSA CREEK  SEL-13-21
## 44  SX_43980    1872                     WHALEN CREEK  SEL-18-11
## 45  SX_44732    2638                    GLACIER CREEK  SEL-06-05
## 46  SX_45103    2451 SETON CHANNELS - UPPER AND LOWER  SEL-03-xx
## 47  SX_45524     303                     NADINA RIVER  SEL-06-04
## 48  SX_45526     304               TAGETOCHLAIN CREEK  SEL-06-04
## 49  SX_45537     306                   NAHOUNLI CREEK  SEL-03-xx
## 50  SX_45538     307                    SOWCHEA CREEK  SEL-03-xx
## 51  SX_45983     294     WEST ROAD (BLACKWATER) RIVER     SER-05
## 52  SX_45993     295                    NECHAKO RIVER     SER-05
## 53  SX_46038     181                   CHEHALIS RIVER  SEL-03-xx
## 54  SX_46157     211                   THOMPSON RIVER  SEL-03-xx
## 55  SX_46208     217                    DEADMAN RIVER  SEL-03-xx
## 56  SX_46883     290                    CARIBOO RIVER  SEL-03-xx
## 57  SX_46953     176                      SWIFT CREEK     SER-06
## 58  SX_46981      62                 CHILLIWACK RIVER  SEL-03-xx
## 59  SX_47092     104                     MARIA SLOUGH  SEL-03-xx
## 60  SX_47099     106                   WAHLEACH CREEK  SEL-03-xx
## 61  SX_47106     108                     MAHOOD CREEK  SEL-03-xx
## 62  SX_47110     109                       RUBY CREEK  SEL-03-xx
## 63  SX_47117     111                 SILVERHOPE CREEK  SEL-03-xx
## 64  SX_47121     112                 COQUIHALLA RIVER  SEL-03-xx
## 65  SX_47125     113                     SUCKER CREEK  SEL-03-xx
## 66  SX_47139     118                   AMERICAN CREEK  SEL-03-xx
## 67  SX_47144     120                      EMORY CREEK  SEL-03-xx
## 68  SX_47147     122                       YALE CREEK  SEL-03-xx
## 69  SX_47150     123                    SPUZZUM CREEK  SEL-03-xx
## 70  SX_47191 7990579         SETON AND CAYOOSH CREEKS  SEL-03-xx
## 71  SX_47201     131                    CAYOOSH CREEK  SEL-03-xx
## 72  SX_47221     133                     BRIDGE RIVER     SER-05
## 73  SX_47231     134                    YALAKOM RIVER     SER-05
## 74  SX_47236     138                      CHURN CREEK  SEL-03-xx
## 75  SX_47439     159                INDIANPOINT CREEK  SEL-07-02
## 76  SX_49654     910                  PACK LAKE CREEK  SEL-11-11
## 77  SX_49808     684                    LATIMER CREEK  SEL-03-xx
## 78  SX_50840     847                   GLENDALE CREEK  SEL-11-12
## 79  SX_50841     847                   GLENDALE CREEK  SEL-11-12
## 80  SX_51610     937                     AMBACK CREEK  SEL-15-03
## 81  SX_51635     942                    GENESEE CREEK  SEL-15-03
## 82  SX_52820    1132                      CLEAR CREEK  SEL-11-10
## 83   SX_7010   21574             BOWRON RIVER - LOWER  SEL-03-xx
## 84   SX_7645    2455                   BAEZAEKO RIVER     SER-05
## 85   SX_7728     133                     BRIDGE RIVER     SER-05
## 86   SX_7745     158                     HAGGEN CREEK     SER-06
##                                           CU_NAME cu_name_pse cuid
## 1                                 INTERIOR FRASER        <NA>   NA
## 2                                 INTERIOR FRASER        <NA>   NA
## 3                                 INTERIOR FRASER        <NA>   NA
## 4                                 INTERIOR FRASER        <NA>   NA
## 5                                 INTERIOR FRASER        <NA>   NA
## 6                                 INTERIOR FRASER        <NA>   NA
## 7                                 INTERIOR FRASER        <NA>   NA
## 8                                 INTERIOR FRASER        <NA>   NA
## 9                                 INTERIOR FRASER        <NA>   NA
## 10                                INTERIOR FRASER        <NA>   NA
## 11                                INTERIOR FRASER        <NA>   NA
## 12                                INTERIOR FRASER        <NA>   NA
## 13                                INTERIOR FRASER        <NA>   NA
## 14                                INTERIOR FRASER        <NA>   NA
## 15                                INTERIOR FRASER        <NA>   NA
## 16                                INTERIOR FRASER        <NA>   NA
## 17                                INTERIOR FRASER        <NA>   NA
## 18                                INTERIOR FRASER        <NA>   NA
## 19                                INTERIOR FRASER        <NA>   NA
## 20                                INTERIOR FRASER        <NA>   NA
## 21                                INTERIOR FRASER        <NA>   NA
## 22      FRASER-CROSS-CU SUPPLEMENTATION EXCLUSION        <NA>   NA
## 23                            SOUTH-MISCELLANEOUS        <NA>   NA
## 24               HATCHERY EXCLUSION-PALLANT CREEK        <NA>   NA
## 25      FRASER-CROSS-CU SUPPLEMENTATION EXCLUSION        <NA>   NA
## 26                           FRASER-MISCELLANEOUS        <NA>   NA
## 27                           FRASER-MISCELLANEOUS        <NA>   NA
## 28      FRASER-CROSS-CU SUPPLEMENTATION EXCLUSION        <NA>   NA
## 29 SOUTHERN BC-CROSS-CU SUPPLEMENTATION EXCLUSION        <NA>   NA
## 30 SOUTHERN BC-CROSS-CU SUPPLEMENTATION EXCLUSION        <NA>   NA
## 31 SOUTHERN BC-CROSS-CU SUPPLEMENTATION EXCLUSION        <NA>   NA
## 32 SOUTHERN BC-CROSS-CU SUPPLEMENTATION EXCLUSION        <NA>   NA
## 33         FRASER-HARRISON FALL TRANSPLANT_FA_0.3        <NA>   NA
## 34                                   FRASER RIVER        <NA>   NA
## 35                                   UPPER FRASER        <NA>   NA
## 36                                   UPPER FRASER        <NA>   NA
## 37                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 38                           GREAT CENTRAL/SPROAT        <NA>   NA
## 39                           GREAT CENTRAL/SPROAT        <NA>   NA
## 40                           GREAT CENTRAL/SPROAT        <NA>   NA
## 41                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 42                          CARIBOO-SUMMER TIMING        <NA>   NA
## 43                                      OWOSSITSA        <NA>   NA
## 44                                         WHALEN        <NA>   NA
## 45                           FRANCOIS-LATE TIMING        <NA>   NA
## 46                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 47                   FRANCOIS-EARLY SUMMER TIMING        <NA>   NA
## 48                   FRANCOIS-EARLY SUMMER TIMING        <NA>   NA
## 49                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 50                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 51                                  MIDDLE FRASER        <NA>   NA
## 52                                  MIDDLE FRASER        <NA>   NA
## 53                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 54                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 55                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 56                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 57                                   UPPER FRASER        <NA>   NA
## 58                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 59                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 60                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 61                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 62                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 63                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 64                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 65                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 66                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 67                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 68                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 69                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 70                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 71                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 72                                  MIDDLE FRASER        <NA>   NA
## 73                                  MIDDLE FRASER        <NA>   NA
## 74                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 75              INDIAN/KRUGER-EARLY SUMMER TIMING        <NA>   NA
## 76                                           PACK        <NA>   NA
## 77                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 78                                       GLENDALE        <NA>   NA
## 79                                       GLENDALE        <NA>   NA
## 80                            OWIKENO-LATE TIMING        <NA>   NA
## 81                            OWIKENO-LATE TIMING        <NA>   NA
## 82                                    VILLAGE BAY        <NA>   NA
## 83                  FRASER RIVER MIGRATORY COUNTS        <NA>   NA
## 84                                  MIDDLE FRASER        <NA>   NA
## 85                                  MIDDLE FRASER        <NA>   NA
## 86                                   UPPER FRASER        <NA>   NA

These series are kept in NUSEDS.

4 Trubleshoot issues

Normally, POP_ID and GFE_ID should be a ONE-TO-ONE relationship (i.e., a given population should be present in only one location and a CU should have only one population in any given location). By comparing the times series presenting either of these these two issues, we can spot clear errors in the data, errors that we solve by either deleting duplicated data points or merging time series.

4.1 Issue: CUs with multiple time series in a single location

CUs with multiple IndexId/POP_ID associated to a single GFE_ID.

There are many cases where a CU in a given location (i.e., a unique GFE_ID) is associated to multiple series (i.e. IndexId/POP_ID), which should not happen. Observing these cases reveals clear duplicated data points or single data point that are not worth keeping. To fix these issues we proceed as follow:

  • Case 1: one of the duplicated series has only one data point:

    • if it is complementary: merge to the other (longer) series

    • if it is in conflict or a duplicate: remove the focal series

  • Case 2: the shorter series is 100% duplicated: removed the focal series

  • Case 3: for the rest of the duplicated series:

    • points that are conflictual or duplicated are summed up

    • points that are complementary are merged

By summing data points in a few cases, we assume that the different runs (e.g., “Chinook Run 1” and “Chinook Run 2”) can be considered a single population. For example, the Bridge River has Summer and Late run sockeye surveys, but these are both in the MIDDLE FRASER river-type sockeye CU and are summed to yield the Bridge River survey for that CU in the PSE.

The figures below show the time series before and after correction with the action made (i.e., “MERGED”, “DELETED”, “SUMMED”) indicated.

There are a few instances where a special fix is done, as indicated below in the corresponding figures.

In the few instances where data points are summed, we define the ESTIMATE_CLASSIFICATION (e.g., “RELATIVE ABUNDANCE (TYPE-3)”) as the value corresponding to the highest MAX_ESTIMATE value between the two data points. In these cases, a table follows the figure and show the data points that are summed with their corresponding Year and ESTIMATE_CLASSIFICATION.

##    IndexId GFE_ID FULL_CU_IN cuid
## 1   CN_275    220      CK-13  313
## 2 CN_46246    220      CK-13  313

## [1] "Special fix (above): the two red points are deleted; one is a duplicate, the other one has 'UNKNOWN' ESTIMATE_CLASSIFICATION:"
##   GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1  ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1    220 2003     CN_275             150                     UNKNOWN   CN_46246
## 2    220 2006     CN_275            6344 RELATIVE ABUNDANCE (TYPE-4)   CN_46246
##   MAX_ESTIMATE_s2  ESTIMATE_CLASSIFICATION_s2
## 1            1920 RELATIVE ABUNDANCE (TYPE-4)
## 2            6344 RELATIVE ABUNDANCE (TYPE-4)
## [1] "***"
##    IndexId GFE_ID FULL_CU_IN cuid
## 1   CM_297     14      CM-02  701
## 2 CM_47925     14      CM-02  701

##   IndexId GFE_ID FULL_CU_IN cuid
## 1  SX_431    252  SEL-09-03  739
## 2 SX_7635    252  SEL-09-03  739

##    IndexId GFE_ID FULL_CU_IN cuid
## 1  CM_1251     67      CM-02  701
## 2 CM_47009     67      CM-02  701

##    IndexId GFE_ID FULL_CU_IN cuid
## 1  CN_1416    312      CK-11  311
## 2 CN_45555    312      CK-11  311

##    IndexId GFE_ID FULL_CU_IN cuid
## 1  CO_1720    241      CO-08  708
## 2 CO_46441    241      CO-08  708

## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
##   GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1    241 1998    CO_1720             309                    UNKNOWN   CO_46441
##   MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1             206                    UNKNOWN
## [1] "***"
##    IndexId GFE_ID FULL_CU_IN cuid
## 1  CO_2182   2476      CO-48  749
## 2 CO_44539   2476      CO-48  749

##    IndexId GFE_ID FULL_CU_IN cuid
## 1  CN_2318    167      CK-12  312
## 2 CN_47519    167      CK-12  312

##    IndexId GFE_ID FULL_CU_IN cuid
## 1  CN_2483    142      CK-10  310
## 2 CN_47277    142      CK-10  310
## 3 CN_47278    142      CK-10  310

## [1] "we can see in the last figure above that there is another time series (CN_47277 - 2464) of the same CU associated to the COTTONWOOD RIVER, which matches the POPULATION of both CN_47277 and CN_47278."
## [1] "**DECISIONS:**"
## [1] "We merge the series CN_47277 - 142 to the series CN_47277 - 2464 by changing its GFE_ID (and associated fields) from 142 to 2464."
## [1] "The one data point of series CN_47278 - 142 is discarded because it is in conflict with series CN_47277 - 2464 (plus ESTIMATE_METHOD = 'Area Under the Curve' vs. 'Peak Live * Expansion')."
## [1] "After corrections:"
## [1] "24 rows were edited in all_areas_nuseds at the following fields: AREA, WATERBODY, GAZETTED_NAME, LOCAL_NAME_1, LOCAL_NAME_2, WATERSHED_CDE, WATERBODY_ID, GFE_ID, FWA_WATERSHED_CDE"

##   IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_3136   2243      CO-22  518
## 2 CO_7776   2243      CO-22  518

##   IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_3306  11486      CK-31  328
## 2 CN_3324  11486      CK-31  328

##   IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_3306  11487      CK-31  328
## 2 CN_3320  11487      CK-31  328

## [1] "The POPULATION of series CN_3306 - 11487 does not align well with WATERBODY (1st plot above)."
## [1] "Additonally, IndexId/POP_ID CN_3306 is associated to multiple GFE_ID (plot above)."
## [1] "**DECISIONS:**"
## [1] "Merge CN_3306 - 11487 to CN_3320 - 11487 in NUSEDS."
## [1] "***"

##    IndexId GFE_ID FULL_CU_IN cuid
## 1  CN_3331   1156      CK-27  325
## 2 CN_53069   1156      CK-27  325

##    IndexId GFE_ID FULL_CU_IN cuid
## 1  CN_3333   1194      CK-83  334
## 2 CN_51518   1194      CK-83  334

## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
##   GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1  ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1   1194 2019    CN_3333               3 RELATIVE ABUNDANCE (TYPE-5)   CN_51518
## 2   1194 2020    CN_3333               9 RELATIVE ABUNDANCE (TYPE-5)   CN_51518
## 3   1194 2021    CN_3333               5 RELATIVE ABUNDANCE (TYPE-5)   CN_51518
## 4   1194 2022    CN_3333              16 RELATIVE ABUNDANCE (TYPE-5)   CN_51518
## 5   1194 2023    CN_3333              14 RELATIVE ABUNDANCE (TYPE-5)   CN_51518
##   MAX_ESTIMATE_s2  ESTIMATE_CLASSIFICATION_s2
## 1             267 RELATIVE ABUNDANCE (TYPE-5)
## 2             722 RELATIVE ABUNDANCE (TYPE-4)
## 3             992     TRUE ABUNDANCE (TYPE-2)
## 4             417 RELATIVE ABUNDANCE (TYPE-3)
## 5             590 RELATIVE ABUNDANCE (TYPE-3)
## [1] "***"
##    IndexId GFE_ID FULL_CU_IN cuid
## 1  CN_3334   1194      CK-25  324
## 2 CN_51519   1194      CK-25  324

## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
##    GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1  ESTIMATE_CLASSIFICATION_s1
## 30   1194 1984    CN_3334             824 RELATIVE ABUNDANCE (TYPE-5)
## 32   1194 1989    CN_3334             483 RELATIVE ABUNDANCE (TYPE-5)
## 33   1194 1990    CN_3334             233 RELATIVE ABUNDANCE (TYPE-5)
## 35   1194 1993    CN_3334             638 RELATIVE ABUNDANCE (TYPE-5)
## 36   1194 1994    CN_3334             437 RELATIVE ABUNDANCE (TYPE-5)
##    IndexId_s2 MAX_ESTIMATE_s2  ESTIMATE_CLASSIFICATION_s2
## 30   CN_51519            2423 RELATIVE ABUNDANCE (TYPE-5)
## 32   CN_51519            2550 RELATIVE ABUNDANCE (TYPE-5)
## 33   CN_51519            1720 RELATIVE ABUNDANCE (TYPE-5)
## 35   CN_51519             740 RELATIVE ABUNDANCE (TYPE-5)
## 36   CN_51519             940 RELATIVE ABUNDANCE (TYPE-5)
## [1] "***"
## [1] "note that the POPULATION of CN_3334 (above) shows 'unkown timing', however the field END_SPAWN_DT_FROM shows that the end of the spawning season is in the FALL, which corresonds to the correct CU:"
##    IndexId FULL_CU_IN GFE_ID     WATERBODY END_SPAWN_DT_FROM
## 1  CN_3334              1194 NANAIMO RIVER         14-OCT-54
## 2  CN_3334              1194 NANAIMO RIVER         15-NOV-64
## 3  CN_3334              1194 NANAIMO RIVER         30-OCT-55
## 4  CN_3334              1194 NANAIMO RIVER         15-SEP-59
## 5  CN_3334              1194 NANAIMO RIVER         28-OCT-72
## 6  CN_3334              1194 NANAIMO RIVER         15-NOV-63
## 7  CN_3334              1194 NANAIMO RIVER         15-NOV-67
## 8  CN_3334              1194 NANAIMO RIVER         01-NOV-89
## 9  CN_3334              1194 NANAIMO RIVER                  
## 10 CN_3334              1194 NANAIMO RIVER         20-OCT-71
## 11 CN_3334              1194 NANAIMO RIVER         15-DEC-61
## 12 CN_3334              1194 NANAIMO RIVER         10-OCT-58
## 13 CN_3334              1194 NANAIMO RIVER         15-NOV-60
## 14 CN_3334              1194 NANAIMO RIVER         30-NOV-78
## 15 CN_3334              1194 NANAIMO RIVER         07-OCT-69
## 16 CN_3334              1194 NANAIMO RIVER         16-NOV-75
## 17 CN_3334              1194 NANAIMO RIVER         15-DEC-68
## 18 CN_3334              1194 NANAIMO RIVER         15-OCT-53
## 19 CN_3334              1194 NANAIMO RIVER         15-NOV-66
## 20 CN_3334              1194 NANAIMO RIVER         15-DEC-62
## 21 CN_3334              1194 NANAIMO RIVER         15-OCT-70
## 22 CN_3334              1194 NANAIMO RIVER         10-NOV-57
## 23 CN_3334              1194 NANAIMO RIVER         10-NOV-73
## 24 CN_3334              1194 NANAIMO RIVER         15-NOV-74
## 25 CN_3334              1194 NANAIMO RIVER         15-NOV-65
## 26 CN_3334              1194 NANAIMO RIVER         01-NOV-76
## 27 CN_3334              1194 NANAIMO RIVER         30-SEP-56
## 28 CN_3334              1194 NANAIMO RIVER         22-NOV-77
## [1] "***"
##    IndexId GFE_ID FULL_CU_IN cuid
## 1  SX_7728    133     SER-05   NA
## 2 SX_47221    133     SER-05   NA

## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
##   GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1  ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1    133 2013    SX_7728              25 RELATIVE ABUNDANCE (TYPE-4)   SX_47221
## 2    133 2014    SX_7728              68 RELATIVE ABUNDANCE (TYPE-3)   SX_47221
## 3    133 2016    SX_7728               0 RELATIVE ABUNDANCE (TYPE-4)   SX_47221
## 4    133 2017    SX_7728               0 RELATIVE ABUNDANCE (TYPE-3)   SX_47221
## 5    133 2018    SX_7728               0 RELATIVE ABUNDANCE (TYPE-3)   SX_47221
## 6    133 2019    SX_7728               4 RELATIVE ABUNDANCE (TYPE-4)   SX_47221
## 7    133 2022    SX_7728               0 RELATIVE ABUNDANCE (TYPE-4)   SX_47221
##   MAX_ESTIMATE_s2  ESTIMATE_CLASSIFICATION_s2
## 1            2131 RELATIVE ABUNDANCE (TYPE-4)
## 2             502 RELATIVE ABUNDANCE (TYPE-3)
## 3              61 RELATIVE ABUNDANCE (TYPE-3)
## 4             158 RELATIVE ABUNDANCE (TYPE-3)
## 5              88 RELATIVE ABUNDANCE (TYPE-3)
## 6             121 RELATIVE ABUNDANCE (TYPE-4)
## 7            4844 RELATIVE ABUNDANCE (TYPE-4)
## [1] "***"
##    IndexId GFE_ID FULL_CU_IN cuid
## 1  CN_7809    442      CK-01  301
## 2 CN_48442    442      CK-01  301

##    IndexId GFE_ID FULL_CU_IN cuid
## 1  SX_7823    298  SEL-06-06  753
## 2 SX_46023    298  SEL-06-06  753

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_39896   1449      CM-11  977
## 2 CM_39897   1449      CM-11  977

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_40023   1208      CK-22  322
## 2 CN_40024   1208      CK-22  322

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_40548   1261      CK-31  328
## 2 CN_40549   1261      CK-31  328

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_41342   1358      CO-17  914
## 2 CO_41343   1358      CO-17  914

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_42331   1482      CM-11  977
## 2 CM_42332   1482      CM-11  977

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_44539    129      CO-48  749
## 2 CO_47183    129      CO-48  749

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_44539   2451      CO-48  749
## 2 CO_47183   2451      CO-48  749

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_44736   2453      CO-07  707
## 2 CO_46170   2453      CO-07  707

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_44772   2464      CK-10  310
## 2 CN_47277   2464      CK-10  310

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_45016   2392      CO-08  708
## 2 CO_45017   2392      CO-08  708

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_45164   2224      CK-63 1006
## 2 CN_45165   2224      CK-63 1006

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 SX_45182   1327     SER-10  972
## 2 SX_52310   1327     SER-10  972

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_46186    214      CK-17  317
## 2 CN_46187    214      CK-17  317

## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
##   GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1           ESTIMATE_CLASSIFICATION_s1
## 1    214 1999   CN_46187             432 RELATIVE: VARYING MULTI-YEAR METHODS
##   IndexId_s2 MAX_ESTIMATE_s2  ESTIMATE_CLASSIFICATION_s2
## 1   CN_46186             114 RELATIVE ABUNDANCE (TYPE-4)
## [1] "***"
##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_46196    215      CK-17  317
## 2 CN_46197    215      CK-17  317

## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
##   GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1           ESTIMATE_CLASSIFICATION_s1
## 1    215 1999   CN_46197             237 RELATIVE: VARYING MULTI-YEAR METHODS
##   IndexId_s2 MAX_ESTIMATE_s2  ESTIMATE_CLASSIFICATION_s2
## 1   CN_46196             208 RELATIVE ABUNDANCE (TYPE-4)
## [1] "***"
##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_46240    221      CO-08  708
## 2 CO_46250    221      CO-08  708

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_46602   2746      CO-09  709
## 2 CO_46632   2746      CO-09  709

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_46961    176      CK-12  312
## 2 CN_46962    176      CK-12  312

## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
##   GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1            ESTIMATE_CLASSIFICATION_s1
## 1    176 1998   CN_46962            2141 RELATIVE: CONSTANT MULTI-YEAR METHODS
## 2    176 2008   CN_46962             421           RELATIVE ABUNDANCE (TYPE-4)
##   IndexId_s2 MAX_ESTIMATE_s2  ESTIMATE_CLASSIFICATION_s2
## 1   CN_46961            1098 RELATIVE ABUNDANCE (TYPE-4)
## 2   CN_46961             422 RELATIVE ABUNDANCE (TYPE-4)
## [1] "***"
##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_47925     15      CM-02  701
## 2 CM_47928     15      CM-02  701

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_47925     17      CM-02  701
## 2 CM_47939     17      CM-02  701

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_49080    800      CM-04  904
## 2 CM_49081    800      CM-04  904

## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
##   GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1  ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1    800 2003   CM_49081            1877 RELATIVE ABUNDANCE (TYPE-4)   CM_49080
##   MAX_ESTIMATE_s2  ESTIMATE_CLASSIFICATION_s2
## 1            2603 RELATIVE ABUNDANCE (TYPE-5)
## [1] "***"
##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_49516    896      CO-12  909
## 2 CO_49517    896      CO-12  909

## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
##   GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1    896 1995   CO_49517             200                    UNKNOWN   CO_49516
## 2    896 1996   CO_49517             100                    UNKNOWN   CO_49516
##   MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1             400                    UNKNOWN
## 2             800                    UNKNOWN
## [1] "***"
##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_49520    896      CM-08  900
## 2 CM_49521    896      CM-08  900

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_49526    897      CO-12  909
## 2 CO_49527    897      CO-12  909

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_49546    899      CO-12  909
## 2 CO_49547    899      CO-12  909

## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
##   GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1    899 1994   CO_49547             600                    UNKNOWN   CO_49546
## 2    899 1996   CO_49547             350                    UNKNOWN   CO_49546
## 3    899 1997   CO_49547              50                    UNKNOWN   CO_49546
##   MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1             300                    UNKNOWN
## 2             200                    UNKNOWN
## 3             100                    UNKNOWN
## [1] "***"
##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_49640    908      CM-08  900
## 2 CM_49641    908      CM-08  900

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_49896    704      CO-12  909
## 2 CO_49897    704      CO-12  909

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_49935    710      CO-10  907
## 2 CO_49936    710      CO-10  907

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_50536    816      CM-07  901
## 2 CM_50537    816      CM-07  901

## [1] "Special fix (above): the blue points before 1950 are deleted, the other ones are summed to the red series."
## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
##    GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1  ESTIMATE_CLASSIFICATION_s1
## 1     816 1934   CM_50537           75000                     UNKNOWN
## 2     816 1935   CM_50537            7500                     UNKNOWN
## 3     816 1936   CM_50537           15000                     UNKNOWN
## 4     816 1937   CM_50537           75000                     UNKNOWN
## 5     816 1939   CM_50537            7500                     UNKNOWN
## 6     816 1940   CM_50537           35000                     UNKNOWN
## 7     816 1942   CM_50537          100000                     UNKNOWN
## 8     816 1943   CM_50537           75000                     UNKNOWN
## 9     816 1944   CM_50537           35000                     UNKNOWN
## 10    816 1995   CM_50537            3000 RELATIVE ABUNDANCE (TYPE-5)
## 11    816 1996   CM_50537            2200 RELATIVE ABUNDANCE (TYPE-5)
## 12    816 1997   CM_50537            1000 RELATIVE ABUNDANCE (TYPE-5)
## 13    816 1998   CM_50537            1000 RELATIVE ABUNDANCE (TYPE-5)
## 14    816 2000   CM_50537             200 RELATIVE ABUNDANCE (TYPE-5)
## 15    816 2002   CM_50537            2012   PRESENCE-ABSENCE (TYPE-6)
## 16    816 2003   CM_50537             215 RELATIVE ABUNDANCE (TYPE-5)
## 17    816 2005   CM_50537             107 RELATIVE ABUNDANCE (TYPE-5)
##         ESTIMATE_METHOD_s1 IndexId_s2 MAX_ESTIMATE_s2
## 1  Unknown Estimate Method   CM_50536           75000
## 2  Unknown Estimate Method   CM_50536            7500
## 3  Unknown Estimate Method   CM_50536           15000
## 4  Unknown Estimate Method   CM_50536           75000
## 5  Unknown Estimate Method   CM_50536            7500
## 6  Unknown Estimate Method   CM_50536           35000
## 7  Unknown Estimate Method   CM_50536          100000
## 8  Unknown Estimate Method   CM_50536           75000
## 9  Unknown Estimate Method   CM_50536           35000
## 10          Expert Opinion   CM_50536           27000
## 11          Expert Opinion   CM_50536            2300
## 12          Expert Opinion   CM_50536            6000
## 13          Expert Opinion   CM_50536            4000
## 14          Expert Opinion   CM_50536            6800
## 15        Peak Live + Dead   CM_50536           20567
## 16        Peak Live + Dead   CM_50536           71023
## 17        Peak Live + Dead   CM_50536            2354
##     ESTIMATE_CLASSIFICATION_s2      ESTIMATE_METHOD_s2
## 1                      UNKNOWN Unknown Estimate Method
## 2                      UNKNOWN Unknown Estimate Method
## 3                      UNKNOWN Unknown Estimate Method
## 4                      UNKNOWN Unknown Estimate Method
## 5                      UNKNOWN Unknown Estimate Method
## 6                      UNKNOWN Unknown Estimate Method
## 7                      UNKNOWN Unknown Estimate Method
## 8                      UNKNOWN Unknown Estimate Method
## 9                      UNKNOWN Unknown Estimate Method
## 10 RELATIVE ABUNDANCE (TYPE-5)          Expert Opinion
## 11 RELATIVE ABUNDANCE (TYPE-5)          Expert Opinion
## 12 RELATIVE ABUNDANCE (TYPE-5)          Expert Opinion
## 13 RELATIVE ABUNDANCE (TYPE-5)          Expert Opinion
## 14 RELATIVE ABUNDANCE (TYPE-5)          Expert Opinion
## 15   PRESENCE-ABSENCE (TYPE-6)        Peak Live + Dead
## 16 RELATIVE ABUNDANCE (TYPE-3)    Area Under the Curve
## 17 RELATIVE ABUNDANCE (TYPE-3)    Area Under the Curve
## [1] "***"
##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_50546    817      CM-07  901
## 2 CM_50547    817      CM-07  901

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 SX_50610    824  SEL-11-06  932
## 2 SX_50611    824  SEL-11-06  932

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_50618    824      CK-28  326
## 2 CN_50619    824      CK-28  326

## [1] "Special fix (above): the two blue points are deleted; one is a duplicate, the other one is very close to be one."
##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_50712    834      CO-12  909
## 2 CO_50713    834      CO-12  909

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_50736    836      CM-06  902
## 2 CM_50737    836      CM-06  902

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_50832    846      CO-12  909
## 2 CO_50833    846      CO-12  909

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 SX_50840    847  SEL-11-12   NA
## 2 SX_50841    847  SEL-11-12   NA

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_50872    850      CO-19  916
## 2 CO_50873    850      CO-19  916

## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
##   GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1    850 1992   CO_50873             100                    UNKNOWN   CO_50872
## 2    850 1993   CO_50873              60                    UNKNOWN   CO_50872
##   MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1             250                    UNKNOWN
## 2             211                    UNKNOWN
## [1] "***"
##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_50878    850      CK-35  332
## 2 CN_50879    850      CK-35  332

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_50892    852      CO-12  909
## 2 CO_50893    852      CO-12  909

## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
##   GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1    852 1993   CO_50893            1370                    UNKNOWN   CO_50892
## 2    852 1995   CO_50893             420                    UNKNOWN   CO_50892
##   MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1              28                    UNKNOWN
## 2             220                    UNKNOWN
## [1] "***"
##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_50902    853      CO-12  909
## 2 CO_50903    853      CO-12  909

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_50956    858      CM-08  900
## 2 CM_50957    858      CM-08  900

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_50972    860      CO-12  909
## 2 CO_50973    860      CO-12  909

## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
##   GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 2    860 1995   CO_50973             150                    UNKNOWN   CO_50972
##   MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 2             392                    UNKNOWN
## [1] "***"
##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_50976    860      CM-08  900
## 2 CM_50977    860      CM-08  900

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_50986    861      CM-08  900
## 2 CM_50987    861      CM-08  900

## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
##   GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1    861 1995   CM_50987            1000                    UNKNOWN   CM_50986
## 2    861 1996   CM_50987             350                    UNKNOWN   CM_50986
##   MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1             800                    UNKNOWN
## 2             450                    UNKNOWN
## [1] "***"
##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_51002    863      CO-12  909
## 2 CO_51003    863      CO-12  909

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_51036    866      CM-08  900
## 2 CM_51037    866      CM-08  900

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_51042    867      CO-12  909
## 2 CO_51043    867      CO-12  909

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_51092    872      CO-12  909
## 2 CO_51093    872      CO-12  909

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_51096    872      CM-08  900
## 2 CM_51097    872      CM-08  900

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_51112    874      CO-12  909
## 2 CO_51113    874      CO-12  909

## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
##   GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1    874 1997   CO_51113             100                    UNKNOWN   CO_51112
## 2    874 1998   CO_51113            9500                    UNKNOWN   CO_51112
##   MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1              25                    UNKNOWN
## 2            3700                    UNKNOWN
## [1] "***"
##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_51147    878      CO-12  909
## 2 CO_51148    878      CO-12  909

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_52512   1100      CO-15  912
## 2 CO_52513   1100      CO-15  912

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_52522   1101      CO-15  912
## 2 CO_52523   1101      CO-15  912

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_52532   1102      CO-15  912
## 2 CO_52533   1102      CO-15  912

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 SX_52625   1112  SEL-12-03  939
## 2 SX_52626   1112  SEL-12-03  939

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_52627   1112      CO-14  911
## 2 CO_52628   1112      CO-14  911

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_52661   1115      CM-05  903
## 2 CM_52662   1115      CM-05  903

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_52672   1117      CO-14  911
## 2 CO_52673   1117      CO-14  911

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_52736   1123      CM-05  903
## 2 CM_52737   1123      CM-05  903

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_52792   1129      CO-11  908
## 2 CO_52793   1129      CO-11  908

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_52802   1130      CO-11  908
## 2 CO_52803   1130      CO-11  908

##    IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_52866   1136      CM-06  902
## 2 CM_52867   1136      CM-06  902

4.2 Issue: A same population found in mutliple locations

We now look at the case where a single IndexId/POP_ID is associated to multiple GFE_ID and make decision on a case by case basis.

## [1] "*** i = 1 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 945    CM_47925     14      CM-02  701
## 154590 CM_47925  31516      CM-02  701
## 154598 CM_47925     15      CM-02  701
## 154606 CM_47925  31740      CM-02  701

## [1] "In case i = 1 we leave the data as is."
## [1] "*** i = 2 ***"
##      IndexId GFE_ID FULL_CU_IN cuid
## 1410 CN_2178   2476      CK-11  311
## 1460 CN_2178    129      CK-11  311

## [1] "In case i = 2 above, the series CN_2178 - 129 is a duplicate of the series CN_47189 - 129 as we can see below:"

## [1] "The series is consequently removed from NUSEDS."
## [1] "Number of rows removed from dataframe = 5"

## [1] "*** i = 3 ***"
##      IndexId GFE_ID FULL_CU_IN cuid
## 3720 SX_3302   3444  SEL-13-xx   NA
## 3743 SX_3302  11485  SEL-13-xx   NA
## 3774 SX_3302   3416  SEL-13-xx   NA

## [1] "In case i = 3 we see that there are multiple time series that could potentially be merged (see both above and below); we however do not make any change."

## [1] "*** i = 4 ***"
##      IndexId GFE_ID FULL_CU_IN cuid
## 3797 CO_3303  11488      CO-17  914
## 3819 CO_3303  11487      CO-17  914
## 3841 CO_3303  11485      CO-17  914

## [1] "In case i = 4 we leave the data as is."
## [1] "*** i = 5 ***"
##      IndexId GFE_ID FULL_CU_IN cuid
## 3872 CM_3305  11486      CM-10  978
## 3893 CM_3305  11485      CM-10  978
## 3925 CM_3305  11487      CM-10  978
## 3947 CM_3305  11488      CM-10  978

## [1] "In case i = 5 we leave the data as is."
## [1] "*** i = 6 ***"
##      IndexId GFE_ID FULL_CU_IN cuid
## 3969 CN_3306  11486      CK-31  328
## 4007 CN_3306  11485      CK-31  328
## 4057 CN_3306  11488      CK-31  328

## [1] "In case i = 6 we leave the data as is."
## [1] "*** i = 7 ***"
##      IndexId GFE_ID FULL_CU_IN cuid
## 5280 SX_3416   2746  SEL-10-03  752
## 5345 SX_3416    261  SEL-10-03  752

## [1] "In case i = 7 we merge series SX_3416 - 2746 to series SX_3416 - 261 because the locations have the same coordinates:"
##                      CENSUS_SITE GFE_ID      X_LONGT      Y_LAT
## 1                  FENNELL CREEK    261 -119.7232685 51.3525769
## 2 FENNELL CREEK AND SASKUM CREEK   2746 -119.7232685 51.3525769
## [1] "Note that no data point are in conflict."
## [1] "65 rows were edited in all_areas_nuseds at the following fields: AREA, WATERBODY, GAZETTED_NAME, LOCAL_NAME_1, LOCAL_NAME_2, WATERSHED_CDE, WATERBODY_ID, GFE_ID, FWA_WATERSHED_CDE"

## [1] "*** i = 8 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 111158 CO_44539   2451      CO-48  749
## 111173 CO_44539   2476      CO-48  749

## [1] "In case i = 8 we leave the data as is."
## [1] "*** i = 9 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 111403 CN_44577   2492      CK-76 1207
## 111405 CN_44577   2518      CK-76 1207

## [1] "In case i = 9 there is a clear discrepancy between the series CN_44577 - 2518's POPULATION (Chandindu River) and it's WATERBODY (Wolf River) as the two rivers are very far apart."
## [1] "But it seems that the data point in fact belongs to the series CN_44599 - 2518:"

## [1] "We decide to merge the data point to CN_44599 - 2518:"
## [1] "1 rows were edited in all_areas_nuseds at the following fields: SPECIES, POPULATION, RUN_TYPE, POP_ID, FULL_CU_IN, species_acronym_ncc, IndexId, CU_NAME, cu_name_pse, cuid"

## [1] "*** i = 10 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 116881 SX_45141   2193     SER-22 1023
## 116916 SX_45141   2119     SER-22 1023

## [1] "In case i = 10 we leave the data as is."
## [1] "*** i = 11 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 119811 SX_45525   2444  SEL-06-20  727
## 119863 SX_45525    303  SEL-06-20  727

## [1] "In case i = 11 we leave the data as is."
## [1] "*** i = 12 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 129320 CO_46170  54283      CO-07  707
## 129329 CO_46170  54282      CO-07  707
## 129338 CO_46170   2461      CO-07  707
## 129345 CO_46170    213      CO-07  707

## [1] "In case i = 12 we leave the data as is."
## [1] "*** i = 13 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 130308 CO_46240    220      CO-08  708
## 130314 CO_46240   3477      CO-08  708

## [1] "In case i = 13 we leave the data as is."
## [1] "*** i = 14 ***"
##         IndexId     GFE_ID FULL_CU_IN cuid
## 134174 CO_46582      33045      CO-09  709
## 134198 CO_46582 1754239945      CO-09  709
## 134216 CO_46582 2049268617      CO-09  709
## 134239 CO_46582        256      CO-09  709
## 134265 CO_46582  719256264      CO-09  709
## 134289 CO_46582      57041      CO-09  709
## 134296 CO_46582      63379      CO-09  709

## [1] "In case i = 14 we leave the data as is."
## [1] "*** i = 15 ***"
##         IndexId    GFE_ID FULL_CU_IN cuid
## 134577 CO_46602       258      CO-09  709
## 134611 CO_46602 212716981      CO-09  709

## [1] "In case i = 15 we leave the data as is."
## [1] "*** i = 16 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 134634 CO_46632   2746      CO-09  709
## 134831 CO_46632    261      CO-09  709

## [1] "In case i = 16 we merge series CO_46632 - 2746 to series CO_46632 - 261 because the locations have the same coordinates:"
##                      CENSUS_SITE GFE_ID      X_LONGT      Y_LAT
## 1                  FENNELL CREEK    261 -119.7232685 51.3525769
## 2 FENNELL CREEK AND SASKUM CREEK   2746 -119.7232685 51.3525769
## [1] "Note that no data point are in conflict."
## [1] "36 rows were edited in all_areas_nuseds at the following fields: AREA, WATERBODY, GAZETTED_NAME, LOCAL_NAME_1, LOCAL_NAME_2, WATERSHED_CDE, WATERBODY_ID, GFE_ID, FWA_WATERSHED_CDE"

## [1] "*** i = 17 ***"
##         IndexId     GFE_ID FULL_CU_IN cuid
## 136162 CO_46795        281      CO-09  709
## 136194 CO_46795      33103      CO-09  709
## 136210 CO_46795 1921661712      CO-09  709

## [1] "In case i = 17 we leave the data as is."
## [1] "*** i = 18 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 136226 CN_46801    281      CK-18  318
## 136276 CN_46801  33103      CK-18  318

## [1] "In case i = 18 we delete the data point of series CN_46801 - 33103 because it is almost a duplicate:"
##    IndexId GFE_ID          WATERBODY Year MAX_ESTIMATE
## 1 CN_46801    281         BLUE RIVER 2006          212
## 2 CN_46801  33103 BLUE RIVER - LOWER 2006          214

## [1] "*** i = 19 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 136410 CO_46835    285      CO-48  749
## 136415 CO_46835   2463      CO-48  749

## [1] "In case i = 19 we merge the data point of series CO_46835 - 285 to series CO_46835 - 2463. Note that the two locations have the same coordinates:"
##               CENSUS_SITE GFE_ID       Y_LAT      X_LONGT
## 1 CHILCOTIN RIVER - LOWER   2463 51.73995282 -122.4013093
## 2         CHILCOTIN RIVER    285 51.73995282 -122.4013093
## [1] "5 rows were edited in all_areas_nuseds at the following fields: AREA, WATERBODY, GAZETTED_NAME, LOCAL_NAME_1, LOCAL_NAME_2, WATERSHED_CDE, WATERBODY_ID, GFE_ID, FWA_WATERSHED_CDE"

## [1] "*** i = 20 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 136419 CN_46841   2462      CK-10  310
## 136449 CN_46841   2454      CK-10  310
## 136455 CN_46841    285      CK-10  310

## [1] "In case i = 20 we leave the data as is."
## [1] "*** i = 21 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 137022 CN_46891    290      CK-11  311
## 137072 CN_46891   2466      CK-11  311
## 137077 CN_46891   2467      CK-11  311

## [1] "In case i = 21 we mcan see that series CN_46891 - 2467 is a duplicate of series CN_46891 - 290, it is consequently deleted."
## [1] "We leave series CN_46891 - 2466."
## [1] "Number of rows removed from dataframe = 5"

## [1] "*** i = 22 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 137082 CN_46892    290      CK-10  310
## 137091 CN_46892   2466      CK-10  310

## [1] "In case i = 22 series CN_46892 - 290 has three duplucated data points with series CN_46892 - 2466."
## [1] "In addition, one data point of the series seem to be a duplicate of the series CN_46891 - 290:"

## [1] "DECISION: we merge CN_46892 - 290 to series CN_46892 - 2466 after deleting the data points that are duplicated in both series CN_46892 - 2466 and CN_46891 - 290."

## [1] "*** i = 23 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 138440 CM_46984     62      CM-02  701
## 138487 CM_46984   2529      CM-02  701

## [1] "In case i = 23 we leave the data as is."
## [1] "*** i = 24 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 143978 SX_47419    157  SEL-07-01  735
## 144065 SX_47419  21574  SEL-07-01  735

## [1] "In case i = 24 we leave the data as is."
## [1] "*** i = 25 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 153806 CM_47906      3      CM-02  701
## 153838 CM_47906    441      CM-02  701

## [1] "In case i = 25 we leave the data as is."
## [1] "*** i = 26 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 153840 CM_47907      7      CM-02  701
## 153882 CM_47907      5      CM-02  701

## [1] "In case i = 26 we leave the data as is."
## [1] "*** i = 27 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 155166 SX_47954     21     SER-02  742
## 155241 SX_47954  54281     SER-02  742

## [1] "In case i = 27 we leave the data as is."
## [1] "*** i = 28 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 166741 CN_48448    443      CK-48  210
## 166785 CN_48448  19723      CK-48  210

## [1] "In case i = 28 we leave the data as is."
## [1] "*** i = 29 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 207392 SX_50610    824  SEL-11-06  932
## 207475 SX_50610   2618  SEL-11-06  932

## [1] "In case i = 29 we leave the data as is."
## [1] "*** i = 30 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 207498 CO_50612    824      CO-12  909
## 207582 CO_50612   2618      CO-12  909

## [1] "In case i = 30 we leave the data as is."
## [1] "*** i = 31 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 207597 CM_50616    824      CM-06  902
## 207682 CM_50616   2618      CM-06  902

## [1] "In case i = 31 we leave the data as is."
## [1] "*** i = 32 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 207697 CN_50618    824      CK-28  326
## 207782 CN_50618   2618      CK-28  326

## [1] "In case i = 32 we leave the data as is."
## [1] "*** i = 33 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 255633 PKE_3304  11485     PKE-02  918
## 255646 PKE_3304  11486     PKE-02  918

## [1] "In case i = 33 we leave the data as is."
## [1] "*** i = 34 ***"
##          IndexId GFE_ID FULL_CU_IN cuid
## 279914 PKE_50614   2618     PKE-04  920
## 279921 PKE_50614    824     PKE-04  920

## [1] "In case i = 34 we leave the data as is."
## [1] "*** i = 35 ***"
##         IndexId GFE_ID FULL_CU_IN cuid
## 287258 PKO_3304  11485     PKO-06  925
## 287271 PKO_3304  11486     PKO-06  925

## [1] "In case i = 35 we leave the data as is."
## [1] "*** i = 36 ***"
##          IndexId GFE_ID FULL_CU_IN cuid
## 309173 PKO_50614    824     PKO-07  926
## 309215 PKO_50614   2618     PKO-07  926

## [1] "In case i = 36 we leave the data as is."
## [1] "*** i = 37 ***"
##          IndexId    GFE_ID FULL_CU_IN cuid
## 310413 PKO_51094       872     PKO-08  610
## 310456 PKO_51094 489440637     PKO-08  610

## [1] "In case i = 37 we merge the series PKO_51094 - 489440637 to series PKO_51094 - 872 because we can assume that the two locations are in fact the same according to the WATERBODY field and the fact that no coordinates were availale for 'Atlatzi+Clear' Rivers (we defined the them manually higher in the script)."
## [1] "1 rows were edited in all_areas_nuseds at the following fields: AREA, WATERBODY, GAZETTED_NAME, LOCAL_NAME_1, LOCAL_NAME_2, WATERSHED_CDE, WATERBODY_ID, GFE_ID, FWA_WATERSHED_CDE"

4.3 Extra corrections for TBR

PSF formed a technical working group (TWG) specifically to compile data for the Transboundary region (cf. transboundary-data). As part of the work the following modifications were requested:

  • POP_ID = 45144: Revise name of sockeye survey from Tahltan River to Tahltan Lake (Jason Calvert, pers. comm. January 9, 2025)

  • TATSAMENIE RIVER coho (POP_ID = 45152) for 1994 and earlier changed to TATSATUA RIVER (POP_ID = 45154) and remaining records 1995+ are removed

  • any records of POP_ID = 45151 (Tatsamenie River lake-type sockeye) for 1994 or earlier get changed to POP_ID = 45153 (TATSATUA RIVER river-type sockeye)

  • one record with POP_ID = 45165 (Chinook Run 2) change to 45164 (Chinook Run 1) in Nahlin river

We execute and document these changes below.

  • POP_ID = 45144: Revise name of sockeye survey from Tahltan River to Tahltan Lake (Jason Calvert, pers. comm. January 9, 2025)
##    IndexId cuid                     POPULATION    WATERBODY
## 1 SX_45144 1026 Tahltan Lake (Stikine) Sockeye TAHLTAN LAKE

The name was already changed in NuSEDS so nothing to do.

  • TATSAMENIE RIVER coho (POP_ID = 45152) for 1994 and earlier changed to TATSATUA RIVER (POP_ID = 45154) and remaining records 1995+ are removed
##        cu_name_pse cuid  IndexId                         POPULATION
## 1 Taku-Late Timing 1015 CO_45152 Tatsamenie River (Alsek/Taku) Coho
## 2 Taku-Late Timing 1015 CO_45154   Tatsatua River (Alsek/Taku) Coho
##        WATERBODY GFE_ID
## 1 TATSATUA CREEK   2211
## 2 TATSATUA CREEK   2212

ISSUE: as we can see above, POP_ID 45154 is already associated to another GFE_ID = 2212, which corresponds to a location with the same WATERBODY but different coordinates:

##   GFE_ID    CENSUS_SITE      X_LONGT       Y_LAT
## 1   2212 TATSATUA CREEK -132.1429556 58.53473781
## 2   2211 TATSATUA CREEK -132.3948794  58.4400493

GFE_ID = 2212 is in the Tatsatua Lake whereas GFE_ID = 2211 is at the mouth of the river.

DECISION

  • For year <= 1994: replace POP_ID = 45152 by 45154 BUT keep GFE_ID = 2211 (note that POP_ID = 45151 is then associated to two GFE_ID, which is not ideal, but at least we preserve the time series in blue above)

  • For year > 1994: delete

Any we have to add the reference combing POP_ID = 45154 and GFE_ID = 2211 to CUSS:

##    GFE_ID    CENSUS_SITE GFE_TYPE SPECIES_QUALIFIED      Y_LAT      X_LONGT
## NA   2211 TATSATUA CREEK   Stream                CO 58.4400493 -132.3948794
##    FAZ_ACRO MAZ_ACRO  JAZ_ACRO          CU_NAME   CU_ACRO    CU_LAT    CU_LONGT
## NA     Taku     TBFj Taku+TBFj TAKU-LATE TIMING Taku-late 58.674303 -133.521737
##    CU_TYPE CU_INDEX FULL_CU_IN SBJ_ID POP_ID IS_INDICATOR
## NA Current       41      CO-41      2  45154            N
##                                              CMNTS EFFECTIVE_DT
## NA Conservation Unit Data-Rev. 4.1f_20-Apr 2017 -      3-Nov-11
##                                               WATERSHED_CDE
## NA 789-236300-00000-00000-0000-0000-000-000-000-000-000-000
##                                                                                                                                  FWA_WATERSHED_CDE
## NA 700-998191-999517-235623-000000-000000-000000-000000-000000-000000-000000-000000-000000-000000-000000-000000-000000-000000-000000-000000-000000
##    coordinates_changed SPECIES species_acronym_ncc  IndexId cuid
## NA               FALSE    Coho                  CO CO_45154 1015
##         cu_name_pse
## NA Taku-Late Timing
  • Any records of POP_ID = 45151 (Tatsamenie River lake-type sockeye) for 1994 or earlier get changed to POP_ID = 45153 (TATSATUA RIVER river-type sockeye)
##                                  cu_name_pse cuid  IndexId
## 1                                 Tatsamenie 1029 SX_45151
## 2 Northern Transboundary Fjords (river-type) 1023 SX_45153
##                              POPULATION      WATERBODY GFE_ID
## 1 Tatsamenie River (Alsek/Taku) Sockeye TATSATUA CREEK   2211
## 2   Tatsatua River (Alsek/Taku) Sockeye TATSATUA CREEK   2212

ISSUE: POP_ID = 45153 is already associated to another GFE_ID = 2212. However, the data points are almost duplicated. Noe that the ESTIMATE_METHOD for the alternative series in blue is “Unknown Estimate Method”:

##         WATERBODY Year IndexId.x MAX_ESTIMATE.x ESTIMATE_METHOD.x IndexId.y
## 1  TATSATUA CREEK 1985  SX_45151          13093 Fixed Site Census  SX_45153
## 2  TATSATUA CREEK 1986  SX_45151          11446 Fixed Site Census  SX_45153
## 3  TATSATUA CREEK 1987  SX_45151           2794 Fixed Site Census  SX_45153
## 4  TATSATUA CREEK 1988  SX_45151           2063 Fixed Site Census  SX_45153
## 5  TATSATUA CREEK 1989  SX_45151           3039 Fixed Site Census  SX_45153
## 6  TATSATUA CREEK 1990  SX_45151           5736 Fixed Site Census  SX_45153
## 7  TATSATUA CREEK 1991  SX_45151           8381 Fixed Site Census  SX_45153
## 8  TATSATUA CREEK 1992  SX_45151           6576 Fixed Site Census      <NA>
## 9  TATSATUA CREEK 1993  SX_45151           5028 Fixed Site Census      <NA>
## 10 TATSATUA CREEK 1994  SX_45151           4371 Fixed Site Census      <NA>
##    MAX_ESTIMATE.y       ESTIMATE_METHOD.y
## 1           12700 Unknown Estimate Method
## 2           11368 Unknown Estimate Method
## 3            2649 Unknown Estimate Method
## 4            2063 Unknown Estimate Method
## 5            3039 Unknown Estimate Method
## 6            5706 Unknown Estimate Method
## 7            8231 Unknown Estimate Method
## 8              NA                    <NA>
## 9              NA                    <NA>
## 10             NA                    <NA>

DECISION

  • For years <= 1994: change POP_ID = 45151 to POP_ID = 45153 and GFE_ID = 2211 to GFE_ID = 2212

  • Delete the existing series POP_ID = 45153 and GFE_ID = 2212

  • One record with POP_ID = 45165 (Chinook Run 2) change to 45164 (Chinook Run 1) in Nahlin river
##    IndexId cuid                              POPULATION    WATERBODY GFE_ID
## 1 CN_45164 1006 Nahlin River (Alsek/Taku) Chinook Run 1 NAHLIN RIVER   2224

Looks like the change was done in the process above, as shown in our records:

##    IndexId GFE_ID dataset
## 1 CN_45165   2224  NUSEDS
##                                                                                                             comment
## 1 The one data point was merged to series CN_45164 & GFE_ID = 2224 as it is the same CU: CU_NAME: TAKU-EARLY TIMING

5 Merge NUSEDS and CUSS & checks

Merge:

col_common <- colnames(all_areas_nuseds)[colnames(all_areas_nuseds) %in% colnames(conservation_unit_census_sites)]
col_common <- col_common[col_common != "species_acronym_ncc"]

col_nuseds <- colnames(all_areas_nuseds)[!colnames(all_areas_nuseds) %in% colnames(conservation_unit_census_sites)]

col_cuss <- colnames(conservation_unit_census_sites)[!colnames(conservation_unit_census_sites) %in% colnames(all_areas_nuseds)]

nuseds_final <- base::merge(x = all_areas_nuseds[,c(col_common,col_nuseds)], 
                            y = conservation_unit_census_sites[,c(col_common,col_cuss)], 
                            by = col_common, 
                            all.x = T)

There should not be any duplicate years, is that true?

## [1] "YES :-)"

All series in NUSEDS should be in CUSS, is that true?

## [1] "YES :-)"

All locations have coordinates, is that true?

## [1] "YES :-)"

And a same location does not has different coordinates. Is that true?

## [1] "YES :-)"

No series are missing a CU-association (i.e. missing both a FULL_CU_IN and a cuid), correct?

## [1] "YES :-)"

No IndeId/POP_ID is associated to multiple CUs, correct?

## [1] "YES :-)"

6 Export datasets

The following files are exported:

  • 1_NuSEDS_escapement_data_collated_DATE.csv: the cleaned combined NUSEDS and CUSS datasets

  • 1_series_inNUSEDS_noInCUSS_DATE.csv: information about the series in NUSEDS and not in CUSS

  • 1_series_removed_DATE.csv: the series removed from either NUSEDS or CUSS and why

  • 1_series_added_DATE.csv: the series added to either NUSEDS or CUSS and why

  • log_file.csv: the log file reporting the name of the main file exported, the date of the export, the name of the present script, the name of the original NUSEDS or CUSS files and the choices related to removing zeros or not.