The goal of the script is to merge the two constituent datasets
all_areas_nuseds (referred to as
NUSEDS) and
conservation_unit_census_sites (referred to as
CUSS) so each time series in NUSEDS
(i.e., fish counts associated to a unique populations, characterized by
a unique IndexId/POP_ID &
GFE_ID association) can be attributed to a unique
conservation unit (CU) in CUSS (characterized by
CU_NAME and FULL_CU_IN). But there are
multiple inconsistencies between the two datasets that lead to discard a
significant amount of data points in NUSEDS, unless
fixes are implemented. This script is a attempt to implement such
fixes.
The dataset version corresponding to this script is:
Fisheries and Oceans Canada. 2025. NuSEDS - New Salmon Escapement Database System. Pacific Biological Station. Updated in 2025-11-03 and downloaded in January 01, 2026, from https://open.canada.ca/data/en/dataset/c48669a3-045b-400d-b730-48aafe8c5ee6.
IMPORTANT NOTE: This is the script that PSF maintains actively and it is updated annually as new versions of NuSEDS are released. The corresponding script used in Atkinson et al. 2025. Monitoring for fisheries or for fish? Declines in monitoring of salmon spawners continue despite a conservation crisis is accessible at: a_nuseds_collation.Rmd.
The NUSEDS (all_areas_nuseds) and CUSS (conservation_unit_census_sites) datasets are imported and duplicated rows are removed. The columns considered for NuSEDS are related to the population, location, fish counts and year of assessment:
## [1] "File imported: All Areas NuSEDS_20251103.csv ; Date modified: 2026-01-08 09:04:18.530449"
## [1] "File imported: Conservation Unit Census Sites_20250528.csv ; Date modified: 2026-01-08 09:09:46.487137"
The list of conservation units (CUs) as shown in the Pacific Salmon
Explorer (PSE) is imported.
The columns cu_name_pse and cu_name_dfo
correspond to CU_NAME in CUSS,
cu_index corresponds to FULL_CU_INDEX.
## region species_name species_qualified cuid cu_name_pse
## 1 Skeena Sockeye SEL 171 Alastair
## 2 Skeena Sockeye SEL 174 Ecstall/Lower
## 3 Skeena Sockeye SEL 175 Johnston
## 4 Skeena Sockeye SEL 176 Kitsumkalum
## 5 Skeena Sockeye SEL 177 Lakelse
## 6 Skeena Sockeye SEL 172 Mcdonell/Dennis/Aldrich
## cu_name_dfo cu_index
## 1 Alastair SEL-20-01
## 2 Ecstall/Lower SEL-20-04
## 3 Johnston SEL-20-05
## 4 Kitsumkalum SEL-20-06
## 5 Lakelse SEL-20-07
## 6 Aldrich SEL-20-02
The DFO file of the stream locations and coordinates
(emailed from Wu Zhipeng, DFO, 09/04/2024) is imported. The columns
NME and ID correspond to
SYSTEM_SITE and GFE_ID in CUSS,
respectively.
## # A tibble: 6 × 4
## NME ID X_LONGT Y_LAT
## <chr> <dbl> <chr> <chr>
## 1 STANLEY CREEK 1557 -132.691420767338 53.9746116925906
## 2 DAVIDSON CREEK 1558 -132.682181062338 53.9544135568635
## 3 NADEN RIVER 1559 -132.672499482506 53.9448571061959
## 4 LIGNITE CREEK 1560 -132.601901312242 53.97114945092
## 5 OTUN RIVER 1561 -132.260106013 54.07328672
## 6 BILL CREEK 1562 -132.395009112518 53.7494926521361
The DFO hatchery data file (emailed from Brock
Ramshaw, DFO, 03/07/2023) is imported because it contains
GFE_ID (i.e., STOCK_GFE_ID and
REL_GFE_ID) and coordinates that are missing in the above
DFO stream location file:
## # A tibble: 6 × 8
## STOCK_GFE_ID STOCK_WATERBODY_NAME STOCK_LATITUDE STOCK_LONGITUDE REL_GFE_ID
## <dbl> <chr> <dbl> <dbl> <dbl>
## 1 1001 TANKEEAH RIVER 52.3 -128. 1001
## 2 1001 TANKEEAH RIVER 52.3 -128. 1001
## 3 1001 TANKEEAH RIVER 52.3 -128. 11611
## 4 1001 TANKEEAH RIVER 52.3 -128. 11611
## 5 1001 TANKEEAH RIVER 52.3 -128. 1001
## 6 1001 TANKEEAH RIVER 52.3 -128. 1001
## # ℹ 3 more variables: REL_WATERBODY_NAME <chr>, REL_LATITUDE <dbl>,
## # REL_LONGITUDE <dbl>
The regions and CUs shape files as defined in the PSE are imported:
We remove from NUSEDS and CUSS data
related to steelhead, Atlantic salmon and Kokanee salmon, corresponding
to 980 rows and 133 populations (POP_ID) in
NUSEDS and 0 in CUSS.
We create the column IndexId, which the the combination
of the species acronym (i.e., CO, CM, CN, PKO, PKE, SX) and
POP_ID:
## [1] "CN_48442" "CO_590" "CO_1794" "CO_1844" "CO_2172" "PKE_51264"
We create the column MAX_ESTIMATE which is the maximum
value of the fish count columns:
## [1] "NATURAL_ADULT_SPAWNERS" "NATURAL_JACK_SPAWNERS"
## [3] "NATURAL_SPAWNERS_TOTAL" "ADULT_BROODSTOCK_REMOVALS"
## [5] "JACK_BROODSTOCK_REMOVALS" "TOTAL_BROODSTOCK_REMOVALS"
## [7] "OTHER_REMOVALS" "TOTAL_RETURN_TO_RIVER"
The major issue in NuSEDS is the discrepancy between the two
datasets. Normally, there should be the same population references
(i.e., unique IndexId/POP_ID &
GFE_ID combinations). But there are 11591 unique population
references in NUSEDS and only 7149 in
CUSS. Additionally, there are 4447 references in
NUSEDS that are not in CUSS and 5 that
are in CUSS but not in NUSEDS. This
section consists in solving these discrepancies so that the same
population references are present in both datasets.
There are 60 duplicated rows in NUSEDS when considering the following fields:
## [1] "SPECIES" "POP_ID"
## [3] "GFE_ID" "Year"
## [5] "NATURAL_ADULT_SPAWNERS" "NATURAL_JACK_SPAWNERS"
## [7] "NATURAL_SPAWNERS_TOTAL" "ADULT_BROODSTOCK_REMOVALS"
## [9] "JACK_BROODSTOCK_REMOVALS" "TOTAL_BROODSTOCK_REMOVALS"
## [11] "OTHER_REMOVALS" "TOTAL_RETURN_TO_RIVER"
## [13] "ENUMERATION_METHODS" "ESTIMATE_CLASSIFICATION"
## SPECIES POP_ID GFE_ID Year MAX_ESTIMATE
## 1 Sockeye 44565 2489 1997 NA
## 2 Sockeye 44565 2489 1998 NA
## 3 Sockeye 44565 2489 2001 NA
## 4 Sockeye 44565 2489 2002 NA
## 5 Coho 44567 2489 1995 NA
## 6 Coho 44567 2489 1996 NA
## 7 Coho 44567 2489 1997 1
## 8 Coho 44567 2489 1998 12
## 9 Coho 44567 2489 1999 NA
## 10 Coho 44567 2489 2000 NA
## 11 Coho 44567 2489 2001 NA
## 12 Coho 44567 2489 2002 NA
## 13 Coho 44567 2489 2003 NA
## 14 Coho 44567 2489 2004 NA
## 15 Coho 44567 2489 2005 NA
## 16 Coho 44567 2489 2007 NA
## 17 Coho 44567 2489 2008 NA
## 18 Coho 44567 2489 2009 NA
## 19 Coho 44567 2489 2010 NA
## 20 Coho 44567 2489 2011 NA
## 21 Coho 44567 2489 2012 NA
## 22 Coho 44567 2489 2013 NA
## 23 Coho 44567 2489 2014 NA
## 24 Coho 44567 2489 2015 NA
## 25 Coho 44567 2489 2016 NA
## 26 Coho 44567 2489 2017 NA
## 27 Coho 44567 2489 2018 NA
## 28 Coho 44567 2489 2019 NA
## 29 Pink 44569 2489 1997 NA
## 30 Pink 44569 2489 1998 NA
## 31 Pink 44569 2489 2001 NA
## 32 Pink 44569 2489 2002 NA
## 33 Chum 44571 2489 1995 NA
## 34 Chum 44571 2489 1996 NA
## 35 Chum 44571 2489 1997 126
## 36 Chum 44571 2489 1998 93
## 37 Chum 44571 2489 1999 NA
## 38 Chum 44571 2489 2000 NA
## 39 Chum 44571 2489 2001 NA
## 40 Chum 44571 2489 2002 NA
## 41 Chum 44571 2489 2003 NA
## 42 Chum 44571 2489 2004 NA
## 43 Chum 44571 2489 2005 NA
## 44 Chum 44571 2489 2008 NA
## 45 Chum 44571 2489 2009 NA
## 46 Chum 44571 2489 2010 NA
## 47 Chum 44571 2489 2011 NA
## 48 Chum 44571 2489 2012 NA
## 49 Chum 44571 2489 2013 NA
## 50 Chum 44571 2489 2014 NA
## 51 Chum 44571 2489 2015 NA
## 52 Chum 44571 2489 2016 NA
## 53 Chum 44571 2489 2017 NA
## 54 Chum 44571 2489 2018 NA
## 55 Chum 44571 2489 2019 NA
## 56 Chinook 44573 2489 1997 NA
## 57 Chinook 44573 2489 1998 NA
## 58 Chinook 44573 2489 2001 NA
## 59 Chinook 44573 2489 2002 NA
## 60 Chum 51316 1174 2011 NA
We assume these are real duplicated rows in NUSEDS so we remove them.
For CUSS, 0 duplicated rows are found:
## [1] GFE_ID CENSUS_SITE GFE_TYPE
## [4] SPECIES_QUALIFIED Y_LAT X_LONGT
## [7] FAZ_ACRO MAZ_ACRO JAZ_ACRO
## [10] CU_NAME CU_ACRO CU_LAT
## [13] CU_LONGT CU_TYPE CU_INDEX
## [16] FULL_CU_IN SBJ_ID POP_ID
## [19] IS_INDICATOR CMNTS EFFECTIVE_DT
## [22] WATERSHED_CDE FWA_WATERSHED_CDE coordinates_changed
## [25] SPECIES species_acronym_ncc IndexId
## <0 rows> (or 0-length row.names)
These rows (if any) are removed as well.
There are 2 instances where multiple MAX_ESTIMATE values
are available in a same Year for a given population:
## IndexId GFE_ID Year MAX_ESTIMATE ESTIMATE_METHOD
## 1 CN_45164 2224 2023 3957.000000 Sonar-ARIS
## 2 CN_45164 2224 2023 462.000000 Peak Live + Dead
## 3 CN_46447 241 2016 494.000000 Calibrated Time Series
## 4 CN_46447 241 2016 475.414455 Calibrated Time Series
## ESTIMATE_CLASSIFICATION
## 1 TRUE ABUNDANCE (TYPE-1)
## 2 RELATIVE ABUNDANCE (TYPE-4)
## 3 RELATIVE ABUNDANCE (TYPE-4)
## 4 RELATIVE ABUNDANCE (TYPE-4)
For CN_45164, we keep the observation with the largest
count because the method associated is better (TYPE 1 vs. TYPE 4).
For CN_46447, there are different
CREATED_DTT and UPDATED_DTT for the two
entries:
## IndexId GFE_ID Year NATURAL_ADULT_SPAWNERS ESTIMATE_METHOD
## 1 CN_46447 241 2016 494.000000 Calibrated Time Series
## 2 CN_46447 241 2016 475.414455 Calibrated Time Series
## CREATED_DTT UPDATED_DTT
## 1 06-AUG-21 26-MAR-25
## 2 16-APR-24
We keep the observation with the most recent entry: the one with
UPDATED_DTT = 26-MAR-25.
There is one instance of a location (GFE_ID) that does
not have values for
FAZ_ACRO,MAZ_ACRO,JAZ_ACRO:
## CENSUS_SITE GFE_ID FAZ_ACRO MAZ_ACRO JAZ_ACRO
## 1 PUNTLEDGE RIVER 1156 EVI GStr EVI+GStr
## 2 PUNTLEDGE RIVER 1156
We edit CUSS accordingly.
There are 9 locations without coordinates in any of the source files and for which we have to define coordinates manually (cf., code to see the sources when available):
## CENSUS_SITE GFE_ID X_LONGT
## 1 NICOLA RIVER (DAM) 54282 -120.665873
## 2 NICOLA RIVER (DOT) 54283 -121.315659
## 3 BIRCH ISLAND CHANNEL 33045 -119.889197
## 4 BLUE RIVER - UPPER 1921661712 -119.289006
## 5 MIDDLE NORTH THOMPSON RIVER MAINSTEM 1754239945 -119.701668
## 6 PIG CHANNEL COMBINED 719256264 119.810014
## 7 UPPER NORTH THOMPSON RIVER MAINSTEM 2049268617 -119.177796
## 8 UPPER NORTH THOMPSON RIVER-MILEDGE CONFLUENCE 57041 -119.173514
## 9 CHILCOTIN RIVER - UPPER 2462 -124.028396
## Y_LAT
## 1 50.163045
## 2 50.424454
## 3 51.598602
## 4 52.112252
## 5 51.593091
## 6 51.579465
## 7 52.345236
## 8 52.281312
## 9 52.339295
We remove these time series to reduce the amount of cleaning to do after. There is the option to remove series with (1) only NAs and 0s or (2) only NAs.
Decision made:
## [1] "Time series uniqually made of NAs are removed (those also containing 0s are kept)."
This procedure removes 102104 rows in NUSEDS, corresponding to 24.4% of the original dataset.
The data removed concerns 4425 time series, which are referenced in
removed_all:
## IndexId GFE_ID dataset comment
## 1 CN_134 2615 NUSEDS Only NAs for MAX_ESTIMATE
## 2 SX_142 2615 NUSEDS Only NAs for MAX_ESTIMATE
## 3 CN_144 2624 NUSEDS Only NAs for MAX_ESTIMATE
## 4 SX_152 2624 NUSEDS Only NAs for MAX_ESTIMATE
## 5 CN_154 2625 NUSEDS Only NAs for MAX_ESTIMATE
## 6 SX_162 2625 NUSEDS Only NAs for MAX_ESTIMATE
Note that we do not remove the references of these series (i.e.,
their IndexId & GFE_ID association) in
CUSS in order to not loose information.
The goal is to look for each IndexId/POP_ID &
GFE_ID series in NUSEDS that are not in
CUSS and to proceed as follow:
The series with a IndexId/POP_ID associate with a
FULL_CU_IN (field added in NUSEDS since
2025-11-03) are added to CUSS. The step might require
to add a new IndexId/POP_ID and/or GFE_ID to
CUSS.
The missing FULL_CU_IN (and consequently
CU_NAME and other related fields) of the remaining series
is found by intersecting the coordinates (X_LONGT and
Y_LAT) associated with the GFE_ID with the
PSE’s CUs’ shape files. Several GFE_ID are missing
coordinates so we look in the DFO stream file and the DFO hatchery file
as alternative sources. Once associated to a CU, these series are added
to CUSS.
There are 264 IndexId & GFE_ID time
series in NUSEDS whose references are not present in
CUSS, among which 53 already have a
FULL_CU_IN.
Among the 194 locations (GFE_ID) associated to these
series, 127 are present with coordinates in CUSS, 40
had to be found in the DFO file of the stream locations, 4 were found in
the DFO hatchery file and 23 were not found anywhere. In the next
section, we attribute coordinates to the latter manually.
There are 23 locations with coordinates we manually defined using Google Maps or any resource we could find.
There are two Clearwater Creek WATERBODY
associated to Phillips River (Campbell River Area)
POPULATION in NUSEDS:
## WATERBODY GFE_ID
## 1 CLEARWATER CREEK 2618
## 2 Clearwater Creek 446068199
None of the two GFE_ID are in CUSS
cond <- conservation_unit_census_sites$GFE_ID %in% c(2618,446068199)
show <- unique(conservation_unit_census_sites[cond,c("CENSUS_SITE","GFE_ID","X_LONGT","Y_LAT")])
show <- as.data.frame(show)
rownames(show) <- NULL
show
## [1] CENSUS_SITE GFE_ID X_LONGT Y_LAT
## <0 rows> (or 0-length row.names)
One of them is in the DFO file of the stream locations:
cond <- DFO_All_Streams_Segments$ID %in% c(2618,446068199)
show <- unique(DFO_All_Streams_Segments[cond,c("NME","ID","X_LONGT","Y_LAT")])
show <- as.data.frame(show)
rownames(show) <- NULL
show
## NME ID X_LONGT Y_LAT
## 1 CLEARWATER CREEK 2618 -125.276480824 50.6158337647
By comparing all the time series having the same
indexId/POP_ID with those two GFE_ID, we can
see all of them can be merged:
We consequently assume that the two locations are the same and we
merge the time time series by replacing the GFE_ID value
without coordinates (446068199) by the one with coordinates
(2618) in NUSEDS.
## [1] "Series SX_50610 - 446068199 merged to series SX_50610 - 2618 in NUSEDS"
## [1] "3 rows were edited in all_areas_nuseds at the following fields: AREA, WATERBODY, GAZETTED_NAME, LOCAL_NAME_1, LOCAL_NAME_2, WATERSHED_CDE, WATERBODY_ID, GFE_ID, FWA_WATERSHED_CDE"
## [1] ""
## [1] "Series CO_50612 - 446068199 merged to series CO_50612 - 2618 in NUSEDS"
## [1] "1 rows were edited in all_areas_nuseds at the following fields: AREA, WATERBODY, GAZETTED_NAME, LOCAL_NAME_1, LOCAL_NAME_2, WATERSHED_CDE, WATERBODY_ID, GFE_ID, FWA_WATERSHED_CDE"
## [1] ""
## [1] "Series CM_50616 - 446068199 merged to series CM_50616 - 2618 in NUSEDS"
## [1] "1 rows were edited in all_areas_nuseds at the following fields: AREA, WATERBODY, GAZETTED_NAME, LOCAL_NAME_1, LOCAL_NAME_2, WATERSHED_CDE, WATERBODY_ID, GFE_ID, FWA_WATERSHED_CDE"
## [1] ""
## [1] "Series CN_50618 - 446068199 merged to series CN_50618 - 2618 in NUSEDS"
## [1] "1 rows were edited in all_areas_nuseds at the following fields: AREA, WATERBODY, GAZETTED_NAME, LOCAL_NAME_1, LOCAL_NAME_2, WATERSHED_CDE, WATERBODY_ID, GFE_ID, FWA_WATERSHED_CDE"
## [1] ""
## [1] "Series PKO_50614 - 446068199 merged to series PKO_50614 - 2618 in NUSEDS"
## [1] "2 rows were edited in all_areas_nuseds at the following fields: AREA, WATERBODY, GAZETTED_NAME, LOCAL_NAME_1, LOCAL_NAME_2, WATERSHED_CDE, WATERBODY_ID, GFE_ID, FWA_WATERSHED_CDE"
## [1] ""
There is a typo for Cariboo river:
cond <- all_areas_nuseds$GFE_ID == 2467
show <- unique(all_areas_nuseds[cond,c("WATERBODY","GFE_ID","IndexId","POPULATION")])
rownames(show) <- NULL
show
## WATERBODY GFE_ID IndexId
## 1 CARRIBOO RIVER (LOWER) 2467 CN_46891
## POPULATION
## 1 Cariboo River (Williams Lake Area) Chinook Run 1
We correct it to CARIBOO RIVER - LOWER in
NUSEDS to match we other notations of the same
river:
cond <- grepl("CARRIBOO",all_areas_nuseds$WATERBODY)
all_areas_nuseds$WATERBODY[cond] <- "CARIBOO RIVER - LOWER"
cond <- grepl("CARRIBOO",trackRecord_nuseds_nocuss$WATERBODY)
trackRecord_nuseds_nocuss$WATERBODY[cond] <- "CARIBOO RIVER - LOWER"
cond <- grepl("CARIBOO",all_areas_nuseds$WATERBODY)
show <- unique(all_areas_nuseds[cond,c("WATERBODY","GFE_ID")])
rownames(show) <- NULL
show
## WATERBODY GFE_ID
## 1 CARIBOO RIVER - UPPER 2466
## 2 CARIBOO RIVER 290
## 3 CARIBOO RIVER - LOWER 2467
Here are the locations for which we manually defined coordinates:
## GFE_ID WATERBODY X_LONGT Y_LAT
## 1 11487 STAMP RIVER BELOW FALLS -124.911882 49.327917
## 2 11488 SPROAT RIVER BELOW FALLS -124.902346 49.290073
## 3 7990588 RETURN CHANNEL CREEKS -128.135232 52.276281
## 4 1303 KENNEDY LAKE BEACHES -125.580441 49.049133
## 5 52749 SILVERMERE LAKE -122.408753 49.174967
## 6 55417 CAMILOS CREEK -121.393056 49.386667
## 7 57042 HANSEN CREEK -128.346389 50.778889
## 8 2464 COTTONWOOD RIVER - LOWER -122.610584 53.119965
## 9 2467 CARIBOO RIVER - LOWER -121.444278 52.721027
## 10 19723 SKEENA RIVER - NORTH -128.636866 54.501018
## 11 432 WIMINASIK LAKE SYSTEM -128.049384 56.489746
## 12 2461 NICOLA RIVER - LOWER -121.021924 50.145281
## 13 95 SEMMIHAULT CREEK -121.956111 49.153056
## 14 2681 MCLELLAN CREEK -126.619731 52.38783
## 15 67553 GIBSON CREEK -123.499173 49.406058
## 16 208 COTTONWOOD CREEK -123.145631 49.762523
## 17 199 FEE CREEK -123.342487 50.068778
## 18 489440637 Atlatzi+Clear Rivers -126.11918 51.015775
## 19 3505 OLSEN CREEK -122.631537 49.597168
## 20 1307 CLAYOQUOT ARM BEACHES -125.591878 49.099033
## 21 2792 MOORE LAKE -129.501134 53.40953
## 22 64745 ATNARKO LAKES -125.713941 52.163834
There are 208 IndexId/POP_ID not associated to a CU (and
211 time series):
## IndexId POPULATION
## 1 CM_1622 Maple Creek (Coquitlam) Chum Run 1
## 2 CM_2434 Stoney Creek (Coquitlam) Chum Run 1
## 3 CM_297 Alouette River (Coquitlam) Chum Run 2
## 4 CM_3212 Clayoquot River Chum
## 5 CM_41256 Satchie Creek Chum
## 6 CM_41371 Oktwanch River Chum
## 7 CM_42458 Tlell River (Sandspit) Chum
## 8 CM_43568 Cascade Creek (Lower Nass) Chum
## 9 CM_44524 Hesquiat Point Creek Chum
## 10 CM_44558 Muriel S1 Creek Chum
## 11 CM_45283 Draw Creek Chum
## 12 CM_45393 Carrington Creek (Campbell River Area) Chum Run 1
## 13 CM_45644 Atnarko Spawning Channel (Bella Coola) Chum
## 14 CM_45931 Return Channel Creeks (Bella Bella) Chum
## 15 CM_46787933 Muriel Lake Creek Chum
## 16 CM_47648 Damshilgwit Creek (Smithers) Chum
## 17 CM_51026 Health Lagoon Creek (Alert Bay) Chum Run 1
## 18 CM_51161 Carriden Creek (Alert Bay) Chum Run 1
## 19 CM_52086 Kennedy Lake Chum
## 20 CM_52096 Kennedy Lake Feeder Streams Chum
## 21 CM_52171 Cold Creek Chum
## 22 CM_52186 Angora Creek Chum
## 23 CM_562 Byrne Creek (Coquitlam) Chum Run 1
## 24 CM_7009 Irgens Creek (Pender Harbour) Chum
## 25 CM_7734 Coho Creek (Mission) Chum
## 26 CM_7735 Eagle Creek (Mission) Chum
## 27 CM_7736 Holachten Creek (Mission) Chum
## 28 CM_7737 Mystery Creek (Mission) Chum
## 29 CM_7739 Silvermere Lake (Mission) Chum
## 30 CM_7765 Coho Creek (Tofino) Chum
## 31 CM_7768 Cultus Lake (Chilliwack) Chum
## 32 CM_7769 Camilos Creek (Chilliwack) Chum
## 33 CM_7774 English Cove Creek (Tofino) Chum
## 34 CM_7775 Noosgulch River (Bella Coola) Chum
## 35 CM_7784 Hansen Creek (Campbell River) Chum
## 36 CM_7796 Fancy Cove (Head) Creek (Bella Bella) Chum
## 37 CM_7801 Colvin Creek (Pender Harbour) Chum
## 38 CM_7812 Fancy Cove Right Hand Creek (Bella Bella) Chum
## 39 CM_7816 Fell Creek (Bella Bella) Chum
## 40 CM_7827 Malcolm Creek Chum
## 41 CN_3213 Clayoquot River Chinook
## 42 CN_3320 Lower Stamp River Chinook
## 43 CN_3324 Upper Stamp Chinook
## 44 CN_3334 Nanaimo River Chinook (unknown timing)
## 45 CN_41103 White Pine Cove Creek Chinook
## 46 CN_45645 Atnarko Spawning Channel (Bella Coola) Chinook
## 47 CN_46658 Dunn Creek (Clearwater) Chinook Run 1
## 48 CN_46668 Mctaggart Creek (Clearwater) Chinook Run 1
## 49 CN_46841 Chilcotin River (Williams Lake Area) Chinook Run 1
## 50 CN_47156 Anderson River (Chilliwack) Chinook
## 51 CN_49102 Hanson's Creek (Campbell River Area) Chinook Run 1
## 52 CN_50009 Upper Paradise Channel (Squamish) Chinook
## 53 CN_50319 Hugh Creek (Kitimat-Butedale) Chinook
## 54 CN_50638 Grassy Creek (Campbell River Area) Chinook Run 1
## 55 CN_50698 Wortley Creek (Campbell River Area) Chinook Run 1
## 56 CN_7809 Okanagan River (Salmon Arm) Summer Chinook
## 57 CN_7834 Whiminasik Lake System (Upper Nass) Chinook
## 58 CO_1624 Maple Creek (Coquitlam) Coho
## 59 CO_1826 Nevin Creek (Chilliwack) Coho
## 60 CO_3027 Wathlsto Creek (Kitimat-Butedale) Coho
## 61 CO_3211 Clayoquot River Coho
## 62 CO_3321 Upper Stamp Coho
## 63 CO_43196 Douglas Inlet Head Creek (West Coast QCI) Coho
## 64 CO_43616 Donahue Creek (Lower Nass) Coho
## 65 CO_44441 Deep Creek (Smithers) Coho
## 66 CO_44840 Crag Creek (Lower Nass) Coho
## 67 CO_44988 Palmer Creek (Salmon Arm) Coho
## 68 CO_45303 Klemtu Creek (Bella Bella) Coho
## 69 CO_45642 Atnarko Spawning Channel (Bella Coola) Coho
## 70 CO_45879 Whaletown Creek (Campbell River Area) Coho Run 1
## 71 CO_46310 Ross Creek (Salmon Arm) Coho
## 72 CO_46855 Taseko Lake (Williams Lake) Coho
## 73 CO_46865 Elkin Creek (Williams Lake) Coho
## 74 CO_47232 Yalakom River (Lillooet) Coho
## 75 CO_48090 Shannon Creek (Terrace) Coho
## 76 CO_48205 Anudol Creek (Upper Nass) Coho
## 77 CO_49415 Lower Tahlo Creek (Smithers) Coho
## 78 CO_52082 Kennedy Lake Coho
## 79 CO_52242 Meares Creek Coho
## 80 CO_52262 Sharp Creek Coho
## 81 CO_52272 Cone Creeks (2) Coho
## 82 CO_52376 Mckay Creek (Kitimat-Butedale) Coho
## 83 CO_572 Calkins Creek (Chilliwack) Coho
## 84 CO_7420 Teto Creek (Salmon Arm) Coho
## 85 CO_7729 Clayton Falls Creek (Bella Coola) Coho
## 86 CO_7730 Bings Creek (Cowichan) Coho
## 87 CO_7747 Bear Creek (Comox) Coho
## 88 CO_7759 Averill Creek (Cowichan) Coho
## 89 CO_7764 Coho Creek (Tofino) Coho
## 90 CO_7771 Blue Lead Creek (Williams Lake) Coho
## 91 CO_7772 Wasko Creek (Williams Lake) Coho
## 92 CO_7773 English Cove Creek (Tofino) Coho
## 93 CO_7776 Jenny Bay West Creek (Bella Coola) Coho
## 94 CO_7779 Quilchena Creek (Lillooet) Coho
## 95 CO_7780 Hazeltine Creek (Williams Lake) Coho
## 96 CO_7793 Mclellan Creek (Bella Coola) Coho
## 97 CO_7802 Colvin Creek (Pender Harbour) Coho
## 98 CO_7814 Fannie Cove Left Hand Creek (Bella Bella) Coho
## 99 CO_7826 Malcolm Creek Coho
## 100 CO_7828 Gibson Creek Coho
## 101 CO_792 Cottonwood Creek (Squamish) Coho Run 1
## 102 CO_959 Fee Creek (Squamish) Coho Run 1
## 103 PKE_180 Myers Creek (Pender Harbour) Pink Run 1
## 104 PKE_3167 Nooklikonnik Creek (Bella Coola) Pink
## 105 PKE_3168 Nusatsum River (Bella Coola) Pink
## 106 PKE_3277 Dak River Pink
## 107 PKE_3432 Gobeil Bay Creek Pink
## 108 PKE_3447 Fannie Cove Right Hand Creek (Bella Bella) Pink
## 109 PKE_45304 Klemtu Creek (Bella Bella) Pink
## 110 PKE_45454 Babine Lake (Smithers) Pink
## 111 PKE_45643 Atnarko Spawning Channel (Bella Coola) Pink
## 112 PKE_45930 Return Channel Creeks (Bella Bella) Pink
## 113 PKE_48638 Roberts Creek (Pender Harbour) Pink
## 114 PKE_48658 Wilson Creek (Pender Harbour) Pink
## 115 PKE_50012 Tenderfoot Creek (Squamish) Pink
## 116 PKE_51064 Wahkana Bay Creek (Alert Bay) Pink Run 1
## 117 PKE_51314 Cook Creek Pink
## 118 PKE_51474 Departure Creek (Nanaimo) Pink Run 1
## 119 PKE_51882 Rainbow Creek (Bella Bella) Pink
## 120 PKE_52377 Mckay Creek (Kitimat-Butedale) Pink
## 121 PKE_52387 Tetlock Creek (Kitimat-Butedale) Pink
## 122 PKE_52392 Hunter Creek (Kitimat-Butedale) Pink
## 123 PKE_52492 Alvin Creek (Grenville Principe) Pink
## 124 PKE_52754 Owen Creek (Campbell River Area) Pink Run 1
## 125 PKE_53044 Little River (Comox) Pink Run 1
## 126 PKE_54793433 Dominic Creek (Quatsino Sound) Pink Run 1
## 127 PKE_7797 Fancy Cove (Head) Creek (Bella Bella) Pink
## 128 PKE_7824 Colvin Creek (Pender Harbour) Pink
## 129 PKE_88092422 Dasque Creek (Terrace) Pink
## 130 PKO_180 Myers Creek (Pender Harbour) Pink Run 1
## 131 PKO_3195 Whitebottom Creek Pink
## 132 PKO_3277 Dak River Pink
## 133 PKO_40049 Goldstream River (Victoria/Saanich) Pink
## 134 PKO_41037 Kumdis Creek (Masset) Pink
## 135 PKO_43007 Bag Harbour Creek (Sandspit) Pink
## 136 PKO_44654 Gaspard Creek (Lillooet) Pink
## 137 PKO_45643 Atnarko Spawning Channel (Bella Coola) Pink
## 138 PKO_45930 Return Channel Creeks (Bella Bella) Pink
## 139 PKO_49088 Store Creek (Powell River Area) Pink Run 1
## 140 PKO_50007 Upper Paradise Channel (Squamish) Pink
## 141 PKO_50012 Tenderfoot Creek (Squamish) Pink
## 142 PKO_50154 South Twin Creek (Vancouver) Pink Run 1
## 143 PKO_52377 Mckay Creek (Kitimat-Butedale) Pink
## 144 PKO_52392 Hunter Creek (Kitimat-Butedale) Pink
## 145 PKO_52492 Alvin Creek (Grenville Principe) Pink
## 146 PKO_52754 Owen Creek (Campbell River Area) Pink Run 1
## 147 PKO_53044 Little River (Comox) Pink Run 1
## 148 PKO_7792 Noosgulch River (Bella Coola) Pink
## 149 PKO_7797 Fancy Cove (Head) Creek (Bella Bella) Pink
## 150 PKO_7811 Bullock Channel East Creek (Bella Bella) Pink
## 151 PKO_7813 Fancy Cove Right Hand Creek (Bella Bella) Pink
## 152 PKO_7817 Fell Creek (Bella Bella) Pink
## 153 PKO_88092422 Dasque Creek (Terrace) Pink
## 154 SX_1874 Olsen Creek (Coquitlam) Early Summer Sockeye Run 1
## 155 SX_3077 Butterfield Creek Early Stuart Sockeye
## 156 SX_3085 Tarnezell Creek Early Stuart Sockeye
## 157 SX_3092 Nancut Creek Early Stuart Sockeye
## 158 SX_3221 Sus Creek Early Summer Sockeye
## 159 SX_3438 Loftus Creek Late Sockeye
## 160 SX_39730 East Creek (Quatsino Sound) Sockeye Run 1
## 161 SX_40015 Cowichan River (Duncan) Sockeye Run 1
## 162 SX_42680 Security Inlet Left Hand Creek (West Coast QCI) Sockeye
## 163 SX_42950 Werner Bay South Right Hand Creek (Sandspit) Sockeye
## 164 SX_43095 Louscoone Inlet Creek (West Coast QCI) Sockeye
## 165 SX_43770 Webster Lake Creek (Bella Bella) Sockeye
## 166 SX_44140 Gribble Island Creek (Kitimat-Butedale) Sockeye
## 167 SX_44160 Fishtrap Bay Creek (Kitimat-Butedale) Sockeye
## 168 SX_44165 Little Tillhorn Creek (Kitimat-Butedale) Sockeye
## 169 SX_45013 Burton Creek (Clearwater) Early Summer Sockeye
## 170 SX_45038 Raft Creek (Williams Lake Area) Summer Sockeye
## 171 SX_45040 Winkley Creek (Williams Lake Area) Summer Sockeye
## 172 SX_45056 Amos Creek (Williams Lake Area) Summer Sockeye
## 173 SX_45084 Upper Momich River (Clearwater) Early Summer Sockeye
## 174 SX_45641 Atnarko Spawning Channel (Bella Coola) Sockeye
## 175 SX_46248 Adams River (Upper) (Clearwater) Early Summer Sockeye
## 176 SX_46268 Momich/Cayenne (Clearwater) Early Summer Sockeye
## 177 SX_46278 Cayenne Creek (Clearwater) Early Summer Sockeye
## 178 SX_46279 Cayenne Creek (Clearwater) Late Sockeye
## 179 SX_46308 Ross Creek (Salmon Arm) Early Summer Sockeye
## 180 SX_48264 Tseax Slough (Upper Nass) Sockeye
## 181 SX_50005 Upper Paradise Channel (Squamish) Sockeye
## 182 SX_50010 Tenderfoot Creek (Squamish) Sockeye
## 183 SX_50340 Wathl Creek (Kitimat-Butedale) Sockeye
## 184 SX_51000 Gilford Creek (Alert Bay) Sockeye Run 1
## 185 SX_51060 Wahkana Bay Creek (Alert Bay) Sockeye Run 1
## 186 SX_51145 Embley Creek (Alert Bay) Sockeye Run 1
## 187 SX_51155 Carriden Creek (Alert Bay) Sockeye Run 1
## 188 SX_52080 Kennedy Lake Sockeye
## 189 SX_52120 Clayoquot Arm Beaches Sockeye
## 190 SX_52390 Hunter Creek (Kitimat-Butedale) Sockeye
## 191 SX_52750 Owen Creek (Campbell River Area) Sockeye Run 1
## 192 SX_52890 Menzies Creek (Campbell River Area) Sockeye Run 1
## 193 SX_52960 Willow Creek (Campbell River Area) Sockeye Run 1
## 194 SX_7490 Middle River Sockeye
## 195 SX_7491 Stephens Creek Sockeye
## 196 SX_7639 Four Mile Creek (Salmon Arm) Late Sockeye
## 197 SX_7640 Hlina Creek (Salmon Arm) Late Sockeye
## 198 SX_7689 Blue Creek (Coquitlam) Early Summer Sockeye
## 199 SX_7694 Cypress Creek (Coquitlam) Early Summer Sockeye
## 200 SX_7805 Moore Lake (Grenville Principe) Sockeye
## 201 SX_7807 Sawki Creek (Mission) Sockeye
## 202 SX_7815 Fannie Cove Left Hand Creek (Bella Bella) Sockeye
## 203 SX_7818 Atnarko Lakes Sockeye
## 204 SX_7822 Joffre Creek Sockeye
## 205 SX_7823 Ormond Creek (Prince George) Early Summer Sockeye
## 206 SX_7833 Whiminaski Lake System (Upper Nass) Sockeye
## 207 SX_7835 Illiance River (Lower Nass) Sockeye
## 208 SX_80808204 Dasque Creek (Terrace) Sockeye
We identify the CU they belong to using the CUs’ shape files of the
PSE. Series with X_LONGT and Y_LAT values that
intersects one unique CU layer of the same species are attributed the
CU’s FULL_CU_IN. When more than one CU layer is
intersected, we use the information in POPULATION and
WATERBODY to manually select the correct CU if
possible.
Below we show the 0 time series for which we could not attribute a CU:
## [1] "There are not any."
There are 3 CUs (5 time series) having a cu_name_pse and
cuid but for who we could not find a CU_NAME
and FULL_CU_IN in CUSS:
## [1] "The series below is/are associated to CU Upper Skeena (cuid = 242) in the PSE:"
## [1] "The series below is/are associated to CU Adams-Early Summer (cuid = 760) in the PSE:"
## [1] "The series below is/are associated to CU Momich-Early summer (cuid = 761) in the PSE:"
Finally, there are 2 time series with a CU_NAME and
FULL_CU_IN that are not in CUSS and for
which we could not find the corresponding cuid and
cu_name_pse in the PSE:
## IndexId GFE_ID FULL_CU_IN WATERBODY Y_LAT X_LONGT
## 1 SX_3302 3416 SEL-13-xx GREAT CENTRAL LAKE 49.32764265 -124.9826436
## 2 SX_3302 3444 SEL-13-xx SPROAT LAKE 49.28801087 -124.9153347
## coordinates_from POPULATION CU_NAME species_name cuid
## 1 CUSS Somass System Sockeye GREAT CENTRAL/SPROAT Sockeye NA
## 2 CUSS Somass System Sockeye GREAT CENTRAL/SPROAT Sockeye NA
## cu_name_pse
## 1 <NA>
## 2 <NA>
## comment
## 1 CUID not found: IndexId = SEL-13-xx and CU_NAME = GREAT CENTRAL/SPROAT
## 2 CUID not found: IndexId = SEL-13-xx and CU_NAME = GREAT CENTRAL/SPROAT
## comment_cu
## 1 <NA>
## 2 <NA>
## [1] "The series below have CU_NAME = GREAT CENTRAL/SPROAT and FULL_CU_IN = SEL-13-xx"
The reference of all these time series (i.e.,
IndexId/POP_ID & GFE_ID and associated
fields) are added to CUSS in the next step.
First we add to CUSS the references of the 252 time
series for which we found the CU_NAME and
FULL_CU_IN in CUSS and the
cuid and cu_name_pse in the PSE.
Second we add to CUSS the references of the 5 time
series for which we did not find the CU_NAME and
FULL_CU_IN but for which we found the cuid and
cu_name_pse.
Finally, the 2 time series with a CU_NAME and
FULL_CU_IN for which we could not find the corresponding
cuid and cu_name_pse are added to
CUSS.
So now all the time series in NUSEDS should be referenced in CUSS, is that true?
## [1] "YES :-)"
In this section we attempt to find the PSE’s cuid and
cu_name_pse to all CUs (FULL_IN_IN and
CU_NAME) in CUSS. We then fill the missing
information for the fields FULL_CU_IN, cuid
and cu_name_pse in NUSEDS.
There are 5 time series without a FULL_CU_IN in
NUSEDS and they should be the ones identified above, is
that the case?
## [1] "YES :-)"
There are 86 time series and 22 FULL_CU_IN in
NUSEDS for which we could not find a cuid
and cu_name_pse:
## IndexId GFE_ID WATERBODY FULL_CU_IN
## 1 CM_1863 256 NORTH THOMPSON RIVER CM-9004
## 2 CM_383 141 BAKER CREEK CM-9004
## 3 CM_46972 55 RAILWAY CREEK CM-9004
## 4 CM_47113 109 RUBY CREEK CM-9004
## 5 CM_47116 110 HUNTER CREEK CM-9004
## 6 CM_47120 111 SILVERHOPE CREEK CM-9004
## 7 CM_47124 112 COQUIHALLA RIVER CM-9004
## 8 CM_47131 113 SUCKER CREEK CM-9004
## 9 CM_47135 114 KAWKAWA LAKE CM-9004
## 10 CM_47136 115 STEVEN CREEK CM-9004
## 11 CM_47137 116 MENZ CREEK CM-9004
## 12 CM_47138 117 KOPP CREEK CM-9004
## 13 CM_47142 118 AMERICAN CREEK CM-9004
## 14 CM_47143 119 SQUEAH LAKE CREEK CM-9004
## 15 CM_47146 120 EMORY CREEK CM-9004
## 16 CM_47149 122 YALE CREEK CM-9004
## 17 CM_47152 123 SPUZZUM CREEK CM-9004
## 18 CM_47155 124 ANDERSON RIVER CM-9004
## 19 CM_47177 127 SEVEN MILE CREEK CM-9004
## 20 CM_47180 128 TEXAS CREEK CM-9004
## 21 CM_47197 7990579 SETON AND CAYOOSH CREEKS CM-9004
## 22 CN_294 14 ALOUETTE RIVER CK-9006
## 23 CN_40043 1210 SHAWNIGAN CREEK CK-9005
## 24 CN_42809 1584 PALLANT CREEK CK-9002
## 25 CN_46042 181 CHEHALIS RIVER CK-9006
## 26 CN_46048 183 WEAVER CREEK CK-9004
## 27 CN_47219 132 GATES RIVER CK-9004
## 28 CN_48009 34 STAVE RIVER CK-9006
## 29 CN_48672 760 CHAPMAN CREEK CK-9007
## 30 CN_49002 792 LANG CREEK CK-9007
## 31 CN_49062 798 SLIAMMON CREEK CK-9007
## 32 CN_49873 697 CAPILANO RIVER CK-9007
## 33 CN_696 62 CHILLIWACK RIVER CK-9008
## 34 PKE_47215 132 GATES RIVER PKE-9005
## 35 SX_1054 2477 FRASER RIVER - ABOVE TETE JAUNE SER-06
## 36 SX_1099 169 GOAT RIVER SER-06
## 37 SX_2804 2037 WILLIAMS LAKE RIVER SEL-03-xx
## 38 SX_3302 3444 SPROAT LAKE SEL-13-xx
## 39 SX_3302 11485 SOMASS-SPROAT-GC SYSTEM SEL-13-xx
## 40 SX_3302 3416 GREAT CENTRAL LAKE SEL-13-xx
## 41 SX_3389 2573 HAWKS CREEK SEL-03-xx
## 42 SX_3443 2466 CARIBOO RIVER - UPPER SEL-06-18
## 43 SX_41615 1388 OWOSSITSA CREEK SEL-13-21
## 44 SX_43980 1872 WHALEN CREEK SEL-18-11
## 45 SX_44732 2638 GLACIER CREEK SEL-06-05
## 46 SX_45103 2451 SETON CHANNELS - UPPER AND LOWER SEL-03-xx
## 47 SX_45524 303 NADINA RIVER SEL-06-04
## 48 SX_45526 304 TAGETOCHLAIN CREEK SEL-06-04
## 49 SX_45537 306 NAHOUNLI CREEK SEL-03-xx
## 50 SX_45538 307 SOWCHEA CREEK SEL-03-xx
## 51 SX_45983 294 WEST ROAD (BLACKWATER) RIVER SER-05
## 52 SX_45993 295 NECHAKO RIVER SER-05
## 53 SX_46038 181 CHEHALIS RIVER SEL-03-xx
## 54 SX_46157 211 THOMPSON RIVER SEL-03-xx
## 55 SX_46208 217 DEADMAN RIVER SEL-03-xx
## 56 SX_46883 290 CARIBOO RIVER SEL-03-xx
## 57 SX_46953 176 SWIFT CREEK SER-06
## 58 SX_46981 62 CHILLIWACK RIVER SEL-03-xx
## 59 SX_47092 104 MARIA SLOUGH SEL-03-xx
## 60 SX_47099 106 WAHLEACH CREEK SEL-03-xx
## 61 SX_47106 108 MAHOOD CREEK SEL-03-xx
## 62 SX_47110 109 RUBY CREEK SEL-03-xx
## 63 SX_47117 111 SILVERHOPE CREEK SEL-03-xx
## 64 SX_47121 112 COQUIHALLA RIVER SEL-03-xx
## 65 SX_47125 113 SUCKER CREEK SEL-03-xx
## 66 SX_47139 118 AMERICAN CREEK SEL-03-xx
## 67 SX_47144 120 EMORY CREEK SEL-03-xx
## 68 SX_47147 122 YALE CREEK SEL-03-xx
## 69 SX_47150 123 SPUZZUM CREEK SEL-03-xx
## 70 SX_47191 7990579 SETON AND CAYOOSH CREEKS SEL-03-xx
## 71 SX_47201 131 CAYOOSH CREEK SEL-03-xx
## 72 SX_47221 133 BRIDGE RIVER SER-05
## 73 SX_47231 134 YALAKOM RIVER SER-05
## 74 SX_47236 138 CHURN CREEK SEL-03-xx
## 75 SX_47439 159 INDIANPOINT CREEK SEL-07-02
## 76 SX_49654 910 PACK LAKE CREEK SEL-11-11
## 77 SX_49808 684 LATIMER CREEK SEL-03-xx
## 78 SX_50840 847 GLENDALE CREEK SEL-11-12
## 79 SX_50841 847 GLENDALE CREEK SEL-11-12
## 80 SX_51610 937 AMBACK CREEK SEL-15-03
## 81 SX_51635 942 GENESEE CREEK SEL-15-03
## 82 SX_52820 1132 CLEAR CREEK SEL-11-10
## 83 SX_7010 21574 BOWRON RIVER - LOWER SEL-03-xx
## 84 SX_7645 2455 BAEZAEKO RIVER SER-05
## 85 SX_7728 133 BRIDGE RIVER SER-05
## 86 SX_7745 158 HAGGEN CREEK SER-06
## CU_NAME cu_name_pse cuid
## 1 INTERIOR FRASER <NA> NA
## 2 INTERIOR FRASER <NA> NA
## 3 INTERIOR FRASER <NA> NA
## 4 INTERIOR FRASER <NA> NA
## 5 INTERIOR FRASER <NA> NA
## 6 INTERIOR FRASER <NA> NA
## 7 INTERIOR FRASER <NA> NA
## 8 INTERIOR FRASER <NA> NA
## 9 INTERIOR FRASER <NA> NA
## 10 INTERIOR FRASER <NA> NA
## 11 INTERIOR FRASER <NA> NA
## 12 INTERIOR FRASER <NA> NA
## 13 INTERIOR FRASER <NA> NA
## 14 INTERIOR FRASER <NA> NA
## 15 INTERIOR FRASER <NA> NA
## 16 INTERIOR FRASER <NA> NA
## 17 INTERIOR FRASER <NA> NA
## 18 INTERIOR FRASER <NA> NA
## 19 INTERIOR FRASER <NA> NA
## 20 INTERIOR FRASER <NA> NA
## 21 INTERIOR FRASER <NA> NA
## 22 FRASER-CROSS-CU SUPPLEMENTATION EXCLUSION <NA> NA
## 23 SOUTH-MISCELLANEOUS <NA> NA
## 24 HATCHERY EXCLUSION-PALLANT CREEK <NA> NA
## 25 FRASER-CROSS-CU SUPPLEMENTATION EXCLUSION <NA> NA
## 26 FRASER-MISCELLANEOUS <NA> NA
## 27 FRASER-MISCELLANEOUS <NA> NA
## 28 FRASER-CROSS-CU SUPPLEMENTATION EXCLUSION <NA> NA
## 29 SOUTHERN BC-CROSS-CU SUPPLEMENTATION EXCLUSION <NA> NA
## 30 SOUTHERN BC-CROSS-CU SUPPLEMENTATION EXCLUSION <NA> NA
## 31 SOUTHERN BC-CROSS-CU SUPPLEMENTATION EXCLUSION <NA> NA
## 32 SOUTHERN BC-CROSS-CU SUPPLEMENTATION EXCLUSION <NA> NA
## 33 FRASER-HARRISON FALL TRANSPLANT_FA_0.3 <NA> NA
## 34 FRASER RIVER <NA> NA
## 35 UPPER FRASER <NA> NA
## 36 UPPER FRASER <NA> NA
## 37 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 38 GREAT CENTRAL/SPROAT <NA> NA
## 39 GREAT CENTRAL/SPROAT <NA> NA
## 40 GREAT CENTRAL/SPROAT <NA> NA
## 41 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 42 CARIBOO-SUMMER TIMING <NA> NA
## 43 OWOSSITSA <NA> NA
## 44 WHALEN <NA> NA
## 45 FRANCOIS-LATE TIMING <NA> NA
## 46 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 47 FRANCOIS-EARLY SUMMER TIMING <NA> NA
## 48 FRANCOIS-EARLY SUMMER TIMING <NA> NA
## 49 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 50 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 51 MIDDLE FRASER <NA> NA
## 52 MIDDLE FRASER <NA> NA
## 53 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 54 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 55 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 56 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 57 UPPER FRASER <NA> NA
## 58 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 59 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 60 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 61 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 62 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 63 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 64 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 65 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 66 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 67 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 68 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 69 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 70 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 71 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 72 MIDDLE FRASER <NA> NA
## 73 MIDDLE FRASER <NA> NA
## 74 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 75 INDIAN/KRUGER-EARLY SUMMER TIMING <NA> NA
## 76 PACK <NA> NA
## 77 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 78 GLENDALE <NA> NA
## 79 GLENDALE <NA> NA
## 80 OWIKENO-LATE TIMING <NA> NA
## 81 OWIKENO-LATE TIMING <NA> NA
## 82 VILLAGE BAY <NA> NA
## 83 FRASER RIVER MIGRATORY COUNTS <NA> NA
## 84 MIDDLE FRASER <NA> NA
## 85 MIDDLE FRASER <NA> NA
## 86 UPPER FRASER <NA> NA
These series are kept in NUSEDS.
Normally, POP_ID and GFE_ID should be a
ONE-TO-ONE relationship (i.e., a given population should be present in
only one location and a CU should have only one population in any given
location). By comparing the times series presenting either of these
these two issues, we can spot clear errors in the data, errors that we
solve by either deleting duplicated data points or merging time
series.
CUs with multiple IndexId/POP_ID associated to a single
GFE_ID.
There are many cases where a CU in a given location (i.e., a unique
GFE_ID) is associated to multiple series
(i.e. IndexId/POP_ID), which should not
happen. Observing these cases reveals clear duplicated data points or
single data point that are not worth keeping. To fix these issues we
proceed as follow:
Case 1: one of the duplicated series has only one data point:
if it is complementary: merge to the other (longer) series
if it is in conflict or a duplicate: remove the focal series
Case 2: the shorter series is 100% duplicated: removed the focal series
Case 3: for the rest of the duplicated series:
points that are conflictual or duplicated are summed up
points that are complementary are merged
By summing data points in a few cases, we assume that the different runs (e.g., “Chinook Run 1” and “Chinook Run 2”) can be considered a single population. For example, the Bridge River has Summer and Late run sockeye surveys, but these are both in the MIDDLE FRASER river-type sockeye CU and are summed to yield the Bridge River survey for that CU in the PSE.
The figures below show the time series before and after correction with the action made (i.e., “MERGED”, “DELETED”, “SUMMED”) indicated.
There are a few instances where a special fix is done, as indicated below in the corresponding figures.
In the few instances where data points are summed, we define the
ESTIMATE_CLASSIFICATION (e.g., “RELATIVE ABUNDANCE
(TYPE-3)”) as the value corresponding to the highest
MAX_ESTIMATE value between the two data points. In these
cases, a table follows the figure and show the data points that are
summed with their corresponding Year and
ESTIMATE_CLASSIFICATION.
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_275 220 CK-13 313
## 2 CN_46246 220 CK-13 313
## [1] "Special fix (above): the two red points are deleted; one is a duplicate, the other one has 'UNKNOWN' ESTIMATE_CLASSIFICATION:"
## GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1 220 2003 CN_275 150 UNKNOWN CN_46246
## 2 220 2006 CN_275 6344 RELATIVE ABUNDANCE (TYPE-4) CN_46246
## MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1 1920 RELATIVE ABUNDANCE (TYPE-4)
## 2 6344 RELATIVE ABUNDANCE (TYPE-4)
## [1] "***"
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_297 14 CM-02 701
## 2 CM_47925 14 CM-02 701
## IndexId GFE_ID FULL_CU_IN cuid
## 1 SX_431 252 SEL-09-03 739
## 2 SX_7635 252 SEL-09-03 739
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_1251 67 CM-02 701
## 2 CM_47009 67 CM-02 701
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_1416 312 CK-11 311
## 2 CN_45555 312 CK-11 311
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_1720 241 CO-08 708
## 2 CO_46441 241 CO-08 708
## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
## GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1 241 1998 CO_1720 309 UNKNOWN CO_46441
## MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1 206 UNKNOWN
## [1] "***"
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_2182 2476 CO-48 749
## 2 CO_44539 2476 CO-48 749
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_2318 167 CK-12 312
## 2 CN_47519 167 CK-12 312
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_2483 142 CK-10 310
## 2 CN_47277 142 CK-10 310
## 3 CN_47278 142 CK-10 310
## [1] "we can see in the last figure above that there is another time series (CN_47277 - 2464) of the same CU associated to the COTTONWOOD RIVER, which matches the POPULATION of both CN_47277 and CN_47278."
## [1] "**DECISIONS:**"
## [1] "We merge the series CN_47277 - 142 to the series CN_47277 - 2464 by changing its GFE_ID (and associated fields) from 142 to 2464."
## [1] "The one data point of series CN_47278 - 142 is discarded because it is in conflict with series CN_47277 - 2464 (plus ESTIMATE_METHOD = 'Area Under the Curve' vs. 'Peak Live * Expansion')."
## [1] "After corrections:"
## [1] "24 rows were edited in all_areas_nuseds at the following fields: AREA, WATERBODY, GAZETTED_NAME, LOCAL_NAME_1, LOCAL_NAME_2, WATERSHED_CDE, WATERBODY_ID, GFE_ID, FWA_WATERSHED_CDE"
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_3136 2243 CO-22 518
## 2 CO_7776 2243 CO-22 518
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_3306 11486 CK-31 328
## 2 CN_3324 11486 CK-31 328
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_3306 11487 CK-31 328
## 2 CN_3320 11487 CK-31 328
## [1] "The POPULATION of series CN_3306 - 11487 does not align well with WATERBODY (1st plot above)."
## [1] "Additonally, IndexId/POP_ID CN_3306 is associated to multiple GFE_ID (plot above)."
## [1] "**DECISIONS:**"
## [1] "Merge CN_3306 - 11487 to CN_3320 - 11487 in NUSEDS."
## [1] "***"
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_3331 1156 CK-27 325
## 2 CN_53069 1156 CK-27 325
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_3333 1194 CK-83 334
## 2 CN_51518 1194 CK-83 334
## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
## GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1 1194 2019 CN_3333 3 RELATIVE ABUNDANCE (TYPE-5) CN_51518
## 2 1194 2020 CN_3333 9 RELATIVE ABUNDANCE (TYPE-5) CN_51518
## 3 1194 2021 CN_3333 5 RELATIVE ABUNDANCE (TYPE-5) CN_51518
## 4 1194 2022 CN_3333 16 RELATIVE ABUNDANCE (TYPE-5) CN_51518
## 5 1194 2023 CN_3333 14 RELATIVE ABUNDANCE (TYPE-5) CN_51518
## MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1 267 RELATIVE ABUNDANCE (TYPE-5)
## 2 722 RELATIVE ABUNDANCE (TYPE-4)
## 3 992 TRUE ABUNDANCE (TYPE-2)
## 4 417 RELATIVE ABUNDANCE (TYPE-3)
## 5 590 RELATIVE ABUNDANCE (TYPE-3)
## [1] "***"
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_3334 1194 CK-25 324
## 2 CN_51519 1194 CK-25 324
## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
## GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1
## 30 1194 1984 CN_3334 824 RELATIVE ABUNDANCE (TYPE-5)
## 32 1194 1989 CN_3334 483 RELATIVE ABUNDANCE (TYPE-5)
## 33 1194 1990 CN_3334 233 RELATIVE ABUNDANCE (TYPE-5)
## 35 1194 1993 CN_3334 638 RELATIVE ABUNDANCE (TYPE-5)
## 36 1194 1994 CN_3334 437 RELATIVE ABUNDANCE (TYPE-5)
## IndexId_s2 MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 30 CN_51519 2423 RELATIVE ABUNDANCE (TYPE-5)
## 32 CN_51519 2550 RELATIVE ABUNDANCE (TYPE-5)
## 33 CN_51519 1720 RELATIVE ABUNDANCE (TYPE-5)
## 35 CN_51519 740 RELATIVE ABUNDANCE (TYPE-5)
## 36 CN_51519 940 RELATIVE ABUNDANCE (TYPE-5)
## [1] "***"
## [1] "note that the POPULATION of CN_3334 (above) shows 'unkown timing', however the field END_SPAWN_DT_FROM shows that the end of the spawning season is in the FALL, which corresonds to the correct CU:"
## IndexId FULL_CU_IN GFE_ID WATERBODY END_SPAWN_DT_FROM
## 1 CN_3334 1194 NANAIMO RIVER 14-OCT-54
## 2 CN_3334 1194 NANAIMO RIVER 15-NOV-64
## 3 CN_3334 1194 NANAIMO RIVER 30-OCT-55
## 4 CN_3334 1194 NANAIMO RIVER 15-SEP-59
## 5 CN_3334 1194 NANAIMO RIVER 28-OCT-72
## 6 CN_3334 1194 NANAIMO RIVER 15-NOV-63
## 7 CN_3334 1194 NANAIMO RIVER 15-NOV-67
## 8 CN_3334 1194 NANAIMO RIVER 01-NOV-89
## 9 CN_3334 1194 NANAIMO RIVER
## 10 CN_3334 1194 NANAIMO RIVER 20-OCT-71
## 11 CN_3334 1194 NANAIMO RIVER 15-DEC-61
## 12 CN_3334 1194 NANAIMO RIVER 10-OCT-58
## 13 CN_3334 1194 NANAIMO RIVER 15-NOV-60
## 14 CN_3334 1194 NANAIMO RIVER 30-NOV-78
## 15 CN_3334 1194 NANAIMO RIVER 07-OCT-69
## 16 CN_3334 1194 NANAIMO RIVER 16-NOV-75
## 17 CN_3334 1194 NANAIMO RIVER 15-DEC-68
## 18 CN_3334 1194 NANAIMO RIVER 15-OCT-53
## 19 CN_3334 1194 NANAIMO RIVER 15-NOV-66
## 20 CN_3334 1194 NANAIMO RIVER 15-DEC-62
## 21 CN_3334 1194 NANAIMO RIVER 15-OCT-70
## 22 CN_3334 1194 NANAIMO RIVER 10-NOV-57
## 23 CN_3334 1194 NANAIMO RIVER 10-NOV-73
## 24 CN_3334 1194 NANAIMO RIVER 15-NOV-74
## 25 CN_3334 1194 NANAIMO RIVER 15-NOV-65
## 26 CN_3334 1194 NANAIMO RIVER 01-NOV-76
## 27 CN_3334 1194 NANAIMO RIVER 30-SEP-56
## 28 CN_3334 1194 NANAIMO RIVER 22-NOV-77
## [1] "***"
## IndexId GFE_ID FULL_CU_IN cuid
## 1 SX_7728 133 SER-05 NA
## 2 SX_47221 133 SER-05 NA
## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
## GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1 133 2013 SX_7728 25 RELATIVE ABUNDANCE (TYPE-4) SX_47221
## 2 133 2014 SX_7728 68 RELATIVE ABUNDANCE (TYPE-3) SX_47221
## 3 133 2016 SX_7728 0 RELATIVE ABUNDANCE (TYPE-4) SX_47221
## 4 133 2017 SX_7728 0 RELATIVE ABUNDANCE (TYPE-3) SX_47221
## 5 133 2018 SX_7728 0 RELATIVE ABUNDANCE (TYPE-3) SX_47221
## 6 133 2019 SX_7728 4 RELATIVE ABUNDANCE (TYPE-4) SX_47221
## 7 133 2022 SX_7728 0 RELATIVE ABUNDANCE (TYPE-4) SX_47221
## MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1 2131 RELATIVE ABUNDANCE (TYPE-4)
## 2 502 RELATIVE ABUNDANCE (TYPE-3)
## 3 61 RELATIVE ABUNDANCE (TYPE-3)
## 4 158 RELATIVE ABUNDANCE (TYPE-3)
## 5 88 RELATIVE ABUNDANCE (TYPE-3)
## 6 121 RELATIVE ABUNDANCE (TYPE-4)
## 7 4844 RELATIVE ABUNDANCE (TYPE-4)
## [1] "***"
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_7809 442 CK-01 301
## 2 CN_48442 442 CK-01 301
## IndexId GFE_ID FULL_CU_IN cuid
## 1 SX_7823 298 SEL-06-06 753
## 2 SX_46023 298 SEL-06-06 753
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_39896 1449 CM-11 977
## 2 CM_39897 1449 CM-11 977
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_40023 1208 CK-22 322
## 2 CN_40024 1208 CK-22 322
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_40548 1261 CK-31 328
## 2 CN_40549 1261 CK-31 328
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_41342 1358 CO-17 914
## 2 CO_41343 1358 CO-17 914
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_42331 1482 CM-11 977
## 2 CM_42332 1482 CM-11 977
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_44539 129 CO-48 749
## 2 CO_47183 129 CO-48 749
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_44539 2451 CO-48 749
## 2 CO_47183 2451 CO-48 749
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_44736 2453 CO-07 707
## 2 CO_46170 2453 CO-07 707
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_44772 2464 CK-10 310
## 2 CN_47277 2464 CK-10 310
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_45016 2392 CO-08 708
## 2 CO_45017 2392 CO-08 708
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_45164 2224 CK-63 1006
## 2 CN_45165 2224 CK-63 1006
## IndexId GFE_ID FULL_CU_IN cuid
## 1 SX_45182 1327 SER-10 972
## 2 SX_52310 1327 SER-10 972
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_46186 214 CK-17 317
## 2 CN_46187 214 CK-17 317
## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
## GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1
## 1 214 1999 CN_46187 432 RELATIVE: VARYING MULTI-YEAR METHODS
## IndexId_s2 MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1 CN_46186 114 RELATIVE ABUNDANCE (TYPE-4)
## [1] "***"
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_46196 215 CK-17 317
## 2 CN_46197 215 CK-17 317
## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
## GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1
## 1 215 1999 CN_46197 237 RELATIVE: VARYING MULTI-YEAR METHODS
## IndexId_s2 MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1 CN_46196 208 RELATIVE ABUNDANCE (TYPE-4)
## [1] "***"
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_46240 221 CO-08 708
## 2 CO_46250 221 CO-08 708
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_46602 2746 CO-09 709
## 2 CO_46632 2746 CO-09 709
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_46961 176 CK-12 312
## 2 CN_46962 176 CK-12 312
## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
## GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1
## 1 176 1998 CN_46962 2141 RELATIVE: CONSTANT MULTI-YEAR METHODS
## 2 176 2008 CN_46962 421 RELATIVE ABUNDANCE (TYPE-4)
## IndexId_s2 MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1 CN_46961 1098 RELATIVE ABUNDANCE (TYPE-4)
## 2 CN_46961 422 RELATIVE ABUNDANCE (TYPE-4)
## [1] "***"
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_47925 15 CM-02 701
## 2 CM_47928 15 CM-02 701
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_47925 17 CM-02 701
## 2 CM_47939 17 CM-02 701
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_49080 800 CM-04 904
## 2 CM_49081 800 CM-04 904
## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
## GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1 800 2003 CM_49081 1877 RELATIVE ABUNDANCE (TYPE-4) CM_49080
## MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1 2603 RELATIVE ABUNDANCE (TYPE-5)
## [1] "***"
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_49516 896 CO-12 909
## 2 CO_49517 896 CO-12 909
## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
## GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1 896 1995 CO_49517 200 UNKNOWN CO_49516
## 2 896 1996 CO_49517 100 UNKNOWN CO_49516
## MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1 400 UNKNOWN
## 2 800 UNKNOWN
## [1] "***"
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_49520 896 CM-08 900
## 2 CM_49521 896 CM-08 900
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_49526 897 CO-12 909
## 2 CO_49527 897 CO-12 909
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_49546 899 CO-12 909
## 2 CO_49547 899 CO-12 909
## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
## GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1 899 1994 CO_49547 600 UNKNOWN CO_49546
## 2 899 1996 CO_49547 350 UNKNOWN CO_49546
## 3 899 1997 CO_49547 50 UNKNOWN CO_49546
## MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1 300 UNKNOWN
## 2 200 UNKNOWN
## 3 100 UNKNOWN
## [1] "***"
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_49640 908 CM-08 900
## 2 CM_49641 908 CM-08 900
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_49896 704 CO-12 909
## 2 CO_49897 704 CO-12 909
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_49935 710 CO-10 907
## 2 CO_49936 710 CO-10 907
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_50536 816 CM-07 901
## 2 CM_50537 816 CM-07 901
## [1] "Special fix (above): the blue points before 1950 are deleted, the other ones are summed to the red series."
## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
## GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1
## 1 816 1934 CM_50537 75000 UNKNOWN
## 2 816 1935 CM_50537 7500 UNKNOWN
## 3 816 1936 CM_50537 15000 UNKNOWN
## 4 816 1937 CM_50537 75000 UNKNOWN
## 5 816 1939 CM_50537 7500 UNKNOWN
## 6 816 1940 CM_50537 35000 UNKNOWN
## 7 816 1942 CM_50537 100000 UNKNOWN
## 8 816 1943 CM_50537 75000 UNKNOWN
## 9 816 1944 CM_50537 35000 UNKNOWN
## 10 816 1995 CM_50537 3000 RELATIVE ABUNDANCE (TYPE-5)
## 11 816 1996 CM_50537 2200 RELATIVE ABUNDANCE (TYPE-5)
## 12 816 1997 CM_50537 1000 RELATIVE ABUNDANCE (TYPE-5)
## 13 816 1998 CM_50537 1000 RELATIVE ABUNDANCE (TYPE-5)
## 14 816 2000 CM_50537 200 RELATIVE ABUNDANCE (TYPE-5)
## 15 816 2002 CM_50537 2012 PRESENCE-ABSENCE (TYPE-6)
## 16 816 2003 CM_50537 215 RELATIVE ABUNDANCE (TYPE-5)
## 17 816 2005 CM_50537 107 RELATIVE ABUNDANCE (TYPE-5)
## ESTIMATE_METHOD_s1 IndexId_s2 MAX_ESTIMATE_s2
## 1 Unknown Estimate Method CM_50536 75000
## 2 Unknown Estimate Method CM_50536 7500
## 3 Unknown Estimate Method CM_50536 15000
## 4 Unknown Estimate Method CM_50536 75000
## 5 Unknown Estimate Method CM_50536 7500
## 6 Unknown Estimate Method CM_50536 35000
## 7 Unknown Estimate Method CM_50536 100000
## 8 Unknown Estimate Method CM_50536 75000
## 9 Unknown Estimate Method CM_50536 35000
## 10 Expert Opinion CM_50536 27000
## 11 Expert Opinion CM_50536 2300
## 12 Expert Opinion CM_50536 6000
## 13 Expert Opinion CM_50536 4000
## 14 Expert Opinion CM_50536 6800
## 15 Peak Live + Dead CM_50536 20567
## 16 Peak Live + Dead CM_50536 71023
## 17 Peak Live + Dead CM_50536 2354
## ESTIMATE_CLASSIFICATION_s2 ESTIMATE_METHOD_s2
## 1 UNKNOWN Unknown Estimate Method
## 2 UNKNOWN Unknown Estimate Method
## 3 UNKNOWN Unknown Estimate Method
## 4 UNKNOWN Unknown Estimate Method
## 5 UNKNOWN Unknown Estimate Method
## 6 UNKNOWN Unknown Estimate Method
## 7 UNKNOWN Unknown Estimate Method
## 8 UNKNOWN Unknown Estimate Method
## 9 UNKNOWN Unknown Estimate Method
## 10 RELATIVE ABUNDANCE (TYPE-5) Expert Opinion
## 11 RELATIVE ABUNDANCE (TYPE-5) Expert Opinion
## 12 RELATIVE ABUNDANCE (TYPE-5) Expert Opinion
## 13 RELATIVE ABUNDANCE (TYPE-5) Expert Opinion
## 14 RELATIVE ABUNDANCE (TYPE-5) Expert Opinion
## 15 PRESENCE-ABSENCE (TYPE-6) Peak Live + Dead
## 16 RELATIVE ABUNDANCE (TYPE-3) Area Under the Curve
## 17 RELATIVE ABUNDANCE (TYPE-3) Area Under the Curve
## [1] "***"
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_50546 817 CM-07 901
## 2 CM_50547 817 CM-07 901
## IndexId GFE_ID FULL_CU_IN cuid
## 1 SX_50610 824 SEL-11-06 932
## 2 SX_50611 824 SEL-11-06 932
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_50618 824 CK-28 326
## 2 CN_50619 824 CK-28 326
## [1] "Special fix (above): the two blue points are deleted; one is a duplicate, the other one is very close to be one."
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_50712 834 CO-12 909
## 2 CO_50713 834 CO-12 909
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_50736 836 CM-06 902
## 2 CM_50737 836 CM-06 902
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_50832 846 CO-12 909
## 2 CO_50833 846 CO-12 909
## IndexId GFE_ID FULL_CU_IN cuid
## 1 SX_50840 847 SEL-11-12 NA
## 2 SX_50841 847 SEL-11-12 NA
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_50872 850 CO-19 916
## 2 CO_50873 850 CO-19 916
## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
## GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1 850 1992 CO_50873 100 UNKNOWN CO_50872
## 2 850 1993 CO_50873 60 UNKNOWN CO_50872
## MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1 250 UNKNOWN
## 2 211 UNKNOWN
## [1] "***"
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CN_50878 850 CK-35 332
## 2 CN_50879 850 CK-35 332
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_50892 852 CO-12 909
## 2 CO_50893 852 CO-12 909
## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
## GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1 852 1993 CO_50893 1370 UNKNOWN CO_50892
## 2 852 1995 CO_50893 420 UNKNOWN CO_50892
## MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1 28 UNKNOWN
## 2 220 UNKNOWN
## [1] "***"
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_50902 853 CO-12 909
## 2 CO_50903 853 CO-12 909
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_50956 858 CM-08 900
## 2 CM_50957 858 CM-08 900
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_50972 860 CO-12 909
## 2 CO_50973 860 CO-12 909
## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
## GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 2 860 1995 CO_50973 150 UNKNOWN CO_50972
## MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 2 392 UNKNOWN
## [1] "***"
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_50976 860 CM-08 900
## 2 CM_50977 860 CM-08 900
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_50986 861 CM-08 900
## 2 CM_50987 861 CM-08 900
## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
## GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1 861 1995 CM_50987 1000 UNKNOWN CM_50986
## 2 861 1996 CM_50987 350 UNKNOWN CM_50986
## MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1 800 UNKNOWN
## 2 450 UNKNOWN
## [1] "***"
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_51002 863 CO-12 909
## 2 CO_51003 863 CO-12 909
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_51036 866 CM-08 900
## 2 CM_51037 866 CM-08 900
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_51042 867 CO-12 909
## 2 CO_51043 867 CO-12 909
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_51092 872 CO-12 909
## 2 CO_51093 872 CO-12 909
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_51096 872 CM-08 900
## 2 CM_51097 872 CM-08 900
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_51112 874 CO-12 909
## 2 CO_51113 874 CO-12 909
## [1] "Below is MAX_ESTIMATE and ESTIMATE_CLASSIFICATION for the data points summed:"
## GFE_ID Year IndexId_s1 MAX_ESTIMATE_s1 ESTIMATE_CLASSIFICATION_s1 IndexId_s2
## 1 874 1997 CO_51113 100 UNKNOWN CO_51112
## 2 874 1998 CO_51113 9500 UNKNOWN CO_51112
## MAX_ESTIMATE_s2 ESTIMATE_CLASSIFICATION_s2
## 1 25 UNKNOWN
## 2 3700 UNKNOWN
## [1] "***"
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_51147 878 CO-12 909
## 2 CO_51148 878 CO-12 909
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_52512 1100 CO-15 912
## 2 CO_52513 1100 CO-15 912
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_52522 1101 CO-15 912
## 2 CO_52523 1101 CO-15 912
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_52532 1102 CO-15 912
## 2 CO_52533 1102 CO-15 912
## IndexId GFE_ID FULL_CU_IN cuid
## 1 SX_52625 1112 SEL-12-03 939
## 2 SX_52626 1112 SEL-12-03 939
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_52627 1112 CO-14 911
## 2 CO_52628 1112 CO-14 911
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_52661 1115 CM-05 903
## 2 CM_52662 1115 CM-05 903
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_52672 1117 CO-14 911
## 2 CO_52673 1117 CO-14 911
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_52736 1123 CM-05 903
## 2 CM_52737 1123 CM-05 903
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_52792 1129 CO-11 908
## 2 CO_52793 1129 CO-11 908
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CO_52802 1130 CO-11 908
## 2 CO_52803 1130 CO-11 908
## IndexId GFE_ID FULL_CU_IN cuid
## 1 CM_52866 1136 CM-06 902
## 2 CM_52867 1136 CM-06 902
We now look at the case where a single IndexId/POP_ID is
associated to multiple GFE_ID and make decision on a case
by case basis.
## [1] "*** i = 1 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 945 CM_47925 14 CM-02 701
## 154590 CM_47925 31516 CM-02 701
## 154598 CM_47925 15 CM-02 701
## 154606 CM_47925 31740 CM-02 701
## [1] "In case i = 1 we leave the data as is."
## [1] "*** i = 2 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 1410 CN_2178 2476 CK-11 311
## 1460 CN_2178 129 CK-11 311
## [1] "In case i = 2 above, the series CN_2178 - 129 is a duplicate of the series CN_47189 - 129 as we can see below:"
## [1] "The series is consequently removed from NUSEDS."
## [1] "Number of rows removed from dataframe = 5"
## [1] "*** i = 3 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 3720 SX_3302 3444 SEL-13-xx NA
## 3743 SX_3302 11485 SEL-13-xx NA
## 3774 SX_3302 3416 SEL-13-xx NA
## [1] "In case i = 3 we see that there are multiple time series that could potentially be merged (see both above and below); we however do not make any change."
## [1] "*** i = 4 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 3797 CO_3303 11488 CO-17 914
## 3819 CO_3303 11487 CO-17 914
## 3841 CO_3303 11485 CO-17 914
## [1] "In case i = 4 we leave the data as is."
## [1] "*** i = 5 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 3872 CM_3305 11486 CM-10 978
## 3893 CM_3305 11485 CM-10 978
## 3925 CM_3305 11487 CM-10 978
## 3947 CM_3305 11488 CM-10 978
## [1] "In case i = 5 we leave the data as is."
## [1] "*** i = 6 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 3969 CN_3306 11486 CK-31 328
## 4007 CN_3306 11485 CK-31 328
## 4057 CN_3306 11488 CK-31 328
## [1] "In case i = 6 we leave the data as is."
## [1] "*** i = 7 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 5280 SX_3416 2746 SEL-10-03 752
## 5345 SX_3416 261 SEL-10-03 752
## [1] "In case i = 7 we merge series SX_3416 - 2746 to series SX_3416 - 261 because the locations have the same coordinates:"
## CENSUS_SITE GFE_ID X_LONGT Y_LAT
## 1 FENNELL CREEK 261 -119.7232685 51.3525769
## 2 FENNELL CREEK AND SASKUM CREEK 2746 -119.7232685 51.3525769
## [1] "Note that no data point are in conflict."
## [1] "65 rows were edited in all_areas_nuseds at the following fields: AREA, WATERBODY, GAZETTED_NAME, LOCAL_NAME_1, LOCAL_NAME_2, WATERSHED_CDE, WATERBODY_ID, GFE_ID, FWA_WATERSHED_CDE"
## [1] "*** i = 8 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 111158 CO_44539 2451 CO-48 749
## 111173 CO_44539 2476 CO-48 749
## [1] "In case i = 8 we leave the data as is."
## [1] "*** i = 9 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 111403 CN_44577 2492 CK-76 1207
## 111405 CN_44577 2518 CK-76 1207
## [1] "In case i = 9 there is a clear discrepancy between the series CN_44577 - 2518's POPULATION (Chandindu River) and it's WATERBODY (Wolf River) as the two rivers are very far apart."
## [1] "But it seems that the data point in fact belongs to the series CN_44599 - 2518:"
## [1] "We decide to merge the data point to CN_44599 - 2518:"
## [1] "1 rows were edited in all_areas_nuseds at the following fields: SPECIES, POPULATION, RUN_TYPE, POP_ID, FULL_CU_IN, species_acronym_ncc, IndexId, CU_NAME, cu_name_pse, cuid"
## [1] "*** i = 10 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 116881 SX_45141 2193 SER-22 1023
## 116916 SX_45141 2119 SER-22 1023
## [1] "In case i = 10 we leave the data as is."
## [1] "*** i = 11 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 119811 SX_45525 2444 SEL-06-20 727
## 119863 SX_45525 303 SEL-06-20 727
## [1] "In case i = 11 we leave the data as is."
## [1] "*** i = 12 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 129320 CO_46170 54283 CO-07 707
## 129329 CO_46170 54282 CO-07 707
## 129338 CO_46170 2461 CO-07 707
## 129345 CO_46170 213 CO-07 707
## [1] "In case i = 12 we leave the data as is."
## [1] "*** i = 13 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 130308 CO_46240 220 CO-08 708
## 130314 CO_46240 3477 CO-08 708
## [1] "In case i = 13 we leave the data as is."
## [1] "*** i = 14 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 134174 CO_46582 33045 CO-09 709
## 134198 CO_46582 1754239945 CO-09 709
## 134216 CO_46582 2049268617 CO-09 709
## 134239 CO_46582 256 CO-09 709
## 134265 CO_46582 719256264 CO-09 709
## 134289 CO_46582 57041 CO-09 709
## 134296 CO_46582 63379 CO-09 709
## [1] "In case i = 14 we leave the data as is."
## [1] "*** i = 15 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 134577 CO_46602 258 CO-09 709
## 134611 CO_46602 212716981 CO-09 709
## [1] "In case i = 15 we leave the data as is."
## [1] "*** i = 16 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 134634 CO_46632 2746 CO-09 709
## 134831 CO_46632 261 CO-09 709
## [1] "In case i = 16 we merge series CO_46632 - 2746 to series CO_46632 - 261 because the locations have the same coordinates:"
## CENSUS_SITE GFE_ID X_LONGT Y_LAT
## 1 FENNELL CREEK 261 -119.7232685 51.3525769
## 2 FENNELL CREEK AND SASKUM CREEK 2746 -119.7232685 51.3525769
## [1] "Note that no data point are in conflict."
## [1] "36 rows were edited in all_areas_nuseds at the following fields: AREA, WATERBODY, GAZETTED_NAME, LOCAL_NAME_1, LOCAL_NAME_2, WATERSHED_CDE, WATERBODY_ID, GFE_ID, FWA_WATERSHED_CDE"
## [1] "*** i = 17 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 136162 CO_46795 281 CO-09 709
## 136194 CO_46795 33103 CO-09 709
## 136210 CO_46795 1921661712 CO-09 709
## [1] "In case i = 17 we leave the data as is."
## [1] "*** i = 18 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 136226 CN_46801 281 CK-18 318
## 136276 CN_46801 33103 CK-18 318
## [1] "In case i = 18 we delete the data point of series CN_46801 - 33103 because it is almost a duplicate:"
## IndexId GFE_ID WATERBODY Year MAX_ESTIMATE
## 1 CN_46801 281 BLUE RIVER 2006 212
## 2 CN_46801 33103 BLUE RIVER - LOWER 2006 214
## [1] "*** i = 19 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 136410 CO_46835 285 CO-48 749
## 136415 CO_46835 2463 CO-48 749
## [1] "In case i = 19 we merge the data point of series CO_46835 - 285 to series CO_46835 - 2463. Note that the two locations have the same coordinates:"
## CENSUS_SITE GFE_ID Y_LAT X_LONGT
## 1 CHILCOTIN RIVER - LOWER 2463 51.73995282 -122.4013093
## 2 CHILCOTIN RIVER 285 51.73995282 -122.4013093
## [1] "5 rows were edited in all_areas_nuseds at the following fields: AREA, WATERBODY, GAZETTED_NAME, LOCAL_NAME_1, LOCAL_NAME_2, WATERSHED_CDE, WATERBODY_ID, GFE_ID, FWA_WATERSHED_CDE"
## [1] "*** i = 20 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 136419 CN_46841 2462 CK-10 310
## 136449 CN_46841 2454 CK-10 310
## 136455 CN_46841 285 CK-10 310
## [1] "In case i = 20 we leave the data as is."
## [1] "*** i = 21 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 137022 CN_46891 290 CK-11 311
## 137072 CN_46891 2466 CK-11 311
## 137077 CN_46891 2467 CK-11 311
## [1] "In case i = 21 we mcan see that series CN_46891 - 2467 is a duplicate of series CN_46891 - 290, it is consequently deleted."
## [1] "We leave series CN_46891 - 2466."
## [1] "Number of rows removed from dataframe = 5"
## [1] "*** i = 22 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 137082 CN_46892 290 CK-10 310
## 137091 CN_46892 2466 CK-10 310
## [1] "In case i = 22 series CN_46892 - 290 has three duplucated data points with series CN_46892 - 2466."
## [1] "In addition, one data point of the series seem to be a duplicate of the series CN_46891 - 290:"
## [1] "DECISION: we merge CN_46892 - 290 to series CN_46892 - 2466 after deleting the data points that are duplicated in both series CN_46892 - 2466 and CN_46891 - 290."
## [1] "*** i = 23 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 138440 CM_46984 62 CM-02 701
## 138487 CM_46984 2529 CM-02 701
## [1] "In case i = 23 we leave the data as is."
## [1] "*** i = 24 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 143978 SX_47419 157 SEL-07-01 735
## 144065 SX_47419 21574 SEL-07-01 735
## [1] "In case i = 24 we leave the data as is."
## [1] "*** i = 25 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 153806 CM_47906 3 CM-02 701
## 153838 CM_47906 441 CM-02 701
## [1] "In case i = 25 we leave the data as is."
## [1] "*** i = 26 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 153840 CM_47907 7 CM-02 701
## 153882 CM_47907 5 CM-02 701
## [1] "In case i = 26 we leave the data as is."
## [1] "*** i = 27 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 155166 SX_47954 21 SER-02 742
## 155241 SX_47954 54281 SER-02 742
## [1] "In case i = 27 we leave the data as is."
## [1] "*** i = 28 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 166741 CN_48448 443 CK-48 210
## 166785 CN_48448 19723 CK-48 210
## [1] "In case i = 28 we leave the data as is."
## [1] "*** i = 29 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 207392 SX_50610 824 SEL-11-06 932
## 207475 SX_50610 2618 SEL-11-06 932
## [1] "In case i = 29 we leave the data as is."
## [1] "*** i = 30 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 207498 CO_50612 824 CO-12 909
## 207582 CO_50612 2618 CO-12 909
## [1] "In case i = 30 we leave the data as is."
## [1] "*** i = 31 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 207597 CM_50616 824 CM-06 902
## 207682 CM_50616 2618 CM-06 902
## [1] "In case i = 31 we leave the data as is."
## [1] "*** i = 32 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 207697 CN_50618 824 CK-28 326
## 207782 CN_50618 2618 CK-28 326
## [1] "In case i = 32 we leave the data as is."
## [1] "*** i = 33 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 255633 PKE_3304 11485 PKE-02 918
## 255646 PKE_3304 11486 PKE-02 918
## [1] "In case i = 33 we leave the data as is."
## [1] "*** i = 34 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 279914 PKE_50614 2618 PKE-04 920
## 279921 PKE_50614 824 PKE-04 920
## [1] "In case i = 34 we leave the data as is."
## [1] "*** i = 35 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 287258 PKO_3304 11485 PKO-06 925
## 287271 PKO_3304 11486 PKO-06 925
## [1] "In case i = 35 we leave the data as is."
## [1] "*** i = 36 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 309173 PKO_50614 824 PKO-07 926
## 309215 PKO_50614 2618 PKO-07 926
## [1] "In case i = 36 we leave the data as is."
## [1] "*** i = 37 ***"
## IndexId GFE_ID FULL_CU_IN cuid
## 310413 PKO_51094 872 PKO-08 610
## 310456 PKO_51094 489440637 PKO-08 610
## [1] "In case i = 37 we merge the series PKO_51094 - 489440637 to series PKO_51094 - 872 because we can assume that the two locations are in fact the same according to the WATERBODY field and the fact that no coordinates were availale for 'Atlatzi+Clear' Rivers (we defined the them manually higher in the script)."
## [1] "1 rows were edited in all_areas_nuseds at the following fields: AREA, WATERBODY, GAZETTED_NAME, LOCAL_NAME_1, LOCAL_NAME_2, WATERSHED_CDE, WATERBODY_ID, GFE_ID, FWA_WATERSHED_CDE"
PSF formed a technical working group (TWG) specifically to compile data for the Transboundary region (cf. transboundary-data). As part of the work the following modifications were requested:
POP_ID = 45144: Revise name of sockeye survey from
Tahltan River to Tahltan Lake (Jason Calvert, pers. comm. January 9,
2025)
TATSAMENIE RIVER coho (POP_ID = 45152) for 1994 and
earlier changed to TATSATUA RIVER (POP_ID = 45154) and
remaining records 1995+ are removed
any records of POP_ID = 45151 (Tatsamenie River
lake-type sockeye) for 1994 or earlier get changed to
POP_ID = 45153 (TATSATUA RIVER river-type sockeye)
one record with POP_ID = 45165 (Chinook Run 2)
change to 45164 (Chinook Run 1) in Nahlin river
We execute and document these changes below.
POP_ID = 45144: Revise name of sockeye survey from
Tahltan River to Tahltan Lake (Jason Calvert, pers. comm. January 9,
2025)## IndexId cuid POPULATION WATERBODY
## 1 SX_45144 1026 Tahltan Lake (Stikine) Sockeye TAHLTAN LAKE
The name was already changed in NuSEDS so nothing to do.
POP_ID = 45152) for 1994 and
earlier changed to TATSATUA RIVER (POP_ID = 45154) and
remaining records 1995+ are removed## cu_name_pse cuid IndexId POPULATION
## 1 Taku-Late Timing 1015 CO_45152 Tatsamenie River (Alsek/Taku) Coho
## 2 Taku-Late Timing 1015 CO_45154 Tatsatua River (Alsek/Taku) Coho
## WATERBODY GFE_ID
## 1 TATSATUA CREEK 2211
## 2 TATSATUA CREEK 2212
ISSUE: as we can see above,
POP_ID 45154 is already associated to another
GFE_ID = 2212, which corresponds to a location with the
same WATERBODY but different coordinates:
## GFE_ID CENSUS_SITE X_LONGT Y_LAT
## 1 2212 TATSATUA CREEK -132.1429556 58.53473781
## 2 2211 TATSATUA CREEK -132.3948794 58.4400493
GFE_ID = 2212 is in the Tatsatua Lake whereas
GFE_ID = 2211 is at the mouth of the river.
DECISION
For year <= 1994: replace POP_ID = 45152 by
45154 BUT keep GFE_ID = 2211 (note that
POP_ID = 45151 is then associated to two
GFE_ID, which is not ideal, but at least we preserve the
time series in blue above)
For year > 1994: delete
Any we have to add the reference combing POP_ID = 45154
and GFE_ID = 2211 to CUSS:
## GFE_ID CENSUS_SITE GFE_TYPE SPECIES_QUALIFIED Y_LAT X_LONGT
## NA 2211 TATSATUA CREEK Stream CO 58.4400493 -132.3948794
## FAZ_ACRO MAZ_ACRO JAZ_ACRO CU_NAME CU_ACRO CU_LAT CU_LONGT
## NA Taku TBFj Taku+TBFj TAKU-LATE TIMING Taku-late 58.674303 -133.521737
## CU_TYPE CU_INDEX FULL_CU_IN SBJ_ID POP_ID IS_INDICATOR
## NA Current 41 CO-41 2 45154 N
## CMNTS EFFECTIVE_DT
## NA Conservation Unit Data-Rev. 4.1f_20-Apr 2017 - 3-Nov-11
## WATERSHED_CDE
## NA 789-236300-00000-00000-0000-0000-000-000-000-000-000-000
## FWA_WATERSHED_CDE
## NA 700-998191-999517-235623-000000-000000-000000-000000-000000-000000-000000-000000-000000-000000-000000-000000-000000-000000-000000-000000-000000
## coordinates_changed SPECIES species_acronym_ncc IndexId cuid
## NA FALSE Coho CO CO_45154 1015
## cu_name_pse
## NA Taku-Late Timing
POP_ID = 45151 (Tatsamenie River
lake-type sockeye) for 1994 or earlier get changed to
POP_ID = 45153 (TATSATUA RIVER river-type sockeye)## cu_name_pse cuid IndexId
## 1 Tatsamenie 1029 SX_45151
## 2 Northern Transboundary Fjords (river-type) 1023 SX_45153
## POPULATION WATERBODY GFE_ID
## 1 Tatsamenie River (Alsek/Taku) Sockeye TATSATUA CREEK 2211
## 2 Tatsatua River (Alsek/Taku) Sockeye TATSATUA CREEK 2212
ISSUE: POP_ID = 45153 is already
associated to another GFE_ID = 2212. However, the data
points are almost duplicated. Noe that the ESTIMATE_METHOD
for the alternative series in blue is “Unknown Estimate Method”:
## WATERBODY Year IndexId.x MAX_ESTIMATE.x ESTIMATE_METHOD.x IndexId.y
## 1 TATSATUA CREEK 1985 SX_45151 13093 Fixed Site Census SX_45153
## 2 TATSATUA CREEK 1986 SX_45151 11446 Fixed Site Census SX_45153
## 3 TATSATUA CREEK 1987 SX_45151 2794 Fixed Site Census SX_45153
## 4 TATSATUA CREEK 1988 SX_45151 2063 Fixed Site Census SX_45153
## 5 TATSATUA CREEK 1989 SX_45151 3039 Fixed Site Census SX_45153
## 6 TATSATUA CREEK 1990 SX_45151 5736 Fixed Site Census SX_45153
## 7 TATSATUA CREEK 1991 SX_45151 8381 Fixed Site Census SX_45153
## 8 TATSATUA CREEK 1992 SX_45151 6576 Fixed Site Census <NA>
## 9 TATSATUA CREEK 1993 SX_45151 5028 Fixed Site Census <NA>
## 10 TATSATUA CREEK 1994 SX_45151 4371 Fixed Site Census <NA>
## MAX_ESTIMATE.y ESTIMATE_METHOD.y
## 1 12700 Unknown Estimate Method
## 2 11368 Unknown Estimate Method
## 3 2649 Unknown Estimate Method
## 4 2063 Unknown Estimate Method
## 5 3039 Unknown Estimate Method
## 6 5706 Unknown Estimate Method
## 7 8231 Unknown Estimate Method
## 8 NA <NA>
## 9 NA <NA>
## 10 NA <NA>
DECISION
For years <= 1994: change POP_ID = 45151 to
POP_ID = 45153 and GFE_ID = 2211 to
GFE_ID = 2212
Delete the existing series POP_ID = 45153 and
GFE_ID = 2212
POP_ID = 45165 (Chinook Run 2) change
to 45164 (Chinook Run 1) in Nahlin river## IndexId cuid POPULATION WATERBODY GFE_ID
## 1 CN_45164 1006 Nahlin River (Alsek/Taku) Chinook Run 1 NAHLIN RIVER 2224
Looks like the change was done in the process above, as shown in our records:
## IndexId GFE_ID dataset
## 1 CN_45165 2224 NUSEDS
## comment
## 1 The one data point was merged to series CN_45164 & GFE_ID = 2224 as it is the same CU: CU_NAME: TAKU-EARLY TIMING
Merge:
col_common <- colnames(all_areas_nuseds)[colnames(all_areas_nuseds) %in% colnames(conservation_unit_census_sites)]
col_common <- col_common[col_common != "species_acronym_ncc"]
col_nuseds <- colnames(all_areas_nuseds)[!colnames(all_areas_nuseds) %in% colnames(conservation_unit_census_sites)]
col_cuss <- colnames(conservation_unit_census_sites)[!colnames(conservation_unit_census_sites) %in% colnames(all_areas_nuseds)]
nuseds_final <- base::merge(x = all_areas_nuseds[,c(col_common,col_nuseds)],
y = conservation_unit_census_sites[,c(col_common,col_cuss)],
by = col_common,
all.x = T)
There should not be any duplicate years, is that true?
## [1] "YES :-)"
All series in NUSEDS should be in CUSS, is that true?
## [1] "YES :-)"
All locations have coordinates, is that true?
## [1] "YES :-)"
And a same location does not has different coordinates. Is that true?
## [1] "YES :-)"
No series are missing a CU-association (i.e. missing both a
FULL_CU_IN and a cuid), correct?
## [1] "YES :-)"
No IndeId/POP_ID is associated to multiple CUs,
correct?
## [1] "YES :-)"
The following files are exported:
1_NuSEDS_escapement_data_collated_DATE.csv: the cleaned combined NUSEDS and CUSS datasets
1_series_inNUSEDS_noInCUSS_DATE.csv: information about the series in NUSEDS and not in CUSS
1_series_removed_DATE.csv: the series removed from either NUSEDS or CUSS and why
1_series_added_DATE.csv: the series added to either NUSEDS or CUSS and why
log_file.csv: the log file reporting the name of the main file exported, the date of the export, the name of the present script, the name of the original NUSEDS or CUSS files and the choices related to removing zeros or not.