To determine the questions and methods folks have been interested in, we searched for capture-recapture papers in the Web of Science. We found more than 5000 relevant papers during the 2009-2019 period.
To make sense of this big corpus, we carried out bibliometric and textual analyses in the spirit of Nakagawa et al. 2018. Explanations along with the code and results are in the next section, Quantitative analyses: Bibliometric and textual analyses.
We also inspected a sample of methodological and ecological papers; see the section Qualitative analyses: Making sense of the corpus of scientific papers on capture-recapture.
To carry out a bibliometric analysis of the capture-recapture literature over the 2009-2019 period, we used the R package bibliometrix. We also carried out a text analysis using topic modelling, for which we recommend the book Text Mining with R.
To collect the data, we used the following settings:
We load the packages we need:
library(bibliometrix) # bib analyses
library(quanteda) # textual data analyses
library(tidyverse) # manipulation and viz data
library(tidytext) # handle text
library(topicmodels) # topic modelling
Let us read in and format the data:
# Loading txt or bib files into R environment
D <- c("data/savedrecs.txt",
"data/savedrecs(1).txt",
"data/savedrecs(2).txt",
"data/savedrecs(3).txt",
"data/savedrecs(4).txt",
"data/savedrecs(5).txt",
"data/savedrecs(6).txt",
"data/savedrecs(7).txt",
"data/savedrecs(8).txt",
"data/savedrecs(9).txt",
"data/savedrecs(10).txt")
# Converting the loaded files into a R bibliographic dataframe
# (takes a minute or two)
M <- convert2df(D, dbsource="wos", format="plaintext")
##
## Converting your wos collection into a bibliographic dataframe
##
## Done!
##
##
## Generating affiliation field tag AU_UN from C1: Done!
We ended up with 5022 articles. Note that WoS allows only 500 items to be exported at once, so we had to repeat the same operation multiple times.
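Because each export holds at most 500 records, the number of files follows from the corpus size. A minimal sketch in base R, assuming the file naming used in the chunk above (savedrecs.txt, savedrecs(1).txt, ...); the variable names are ours, for illustration only:

```r
# Number of WoS exports needed for the corpus, at 500 records per export
n_records <- 5022
batch_size <- 500
n_files <- ceiling(n_records / batch_size)

# Rebuild the vector of file names instead of typing each one:
# the first export has no suffix, later ones are numbered (1), (2), ...
files <- file.path("data",
                   c("savedrecs.txt",
                     sprintf("savedrecs(%d).txt", seq_len(n_files - 1))))
print(n_files)
print(files)
```

The resulting `files` vector could be passed to convert2df() in place of D.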
We export back as a csv file for further inspection:
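The export chunk is not displayed. As a rough sketch only, a bibliographic data frame can be written to csv and read back with base R; the toy data frame, the column selection and the temporary file below are placeholders, not the actual export:

```r
# Toy stand-in for the bibliographic data frame M
# (WoS field tags as column names; made-up content)
toy <- data.frame(
  AU = c("AUTHOR A;AUTHOR B", "AUTHOR C"),
  TI = c("A FIRST TITLE", "A SECOND TITLE"),
  SO = c("JOURNAL X", "JOURNAL Y"),
  PY = c(2009, 2019),
  stringsAsFactors = FALSE
)

# Write to a temporary csv file (hypothetical path), then read it back
out_file <- tempfile(fileext = ".csv")
write.csv(toy, out_file, row.names = FALSE)
back <- read.csv(out_file, stringsAsFactors = FALSE)
print(back)
```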
WoS provides the user with a bunch of graphs; let’s have a look.
Research areas are:
The number of publications per year is:
The countries of the first author are:
The journals are:
The most productive authors are:
The graphs for the dataset of citing articles (who uses capture-recapture and what it is used for) show the same patterns as those for the dataset of published articles, except for the journals. A bunch of citations come from a few other journals, namely Biological Conservation, Scientific Reports, Molecular Ecology and Proceedings of the Royal Society B - Biological Sciences:
We also want to produce our own descriptive statistics. Let’s have a look at the data with R.
Number of papers per journal:
dat <- as_tibble(M)
dat %>%
group_by(SO) %>%
count() %>%
filter(n > 50) %>%
ggplot(aes(n, reorder(SO, n))) +
geom_col() +
labs(title = "Nb of papers per journal", x = "", y = "")
Most common words in titles:
wordft <- dat %>%
mutate(line = row_number()) %>%
filter(nchar(TI) > 0) %>%
unnest_tokens(word, TI) %>%
anti_join(stop_words)
wordft %>%
count(word, sort = TRUE)
wordft %>%
count(word, sort = TRUE) %>%
filter(n > 200) %>%
mutate(word = reorder(word, n)) %>%
ggplot(aes(n, word)) +
geom_col() +
labs(title = "Most common words in titles", x = "", y = "")
Most common words in abstracts:
wordab <- dat %>%
mutate(line = row_number()) %>%
filter(nchar(AB) > 0) %>%
unnest_tokens(word, AB) %>%
anti_join(stop_words)
wordab %>%
count(word, sort = TRUE)
Now we turn to a more detailed analysis of the published articles.
First calculate the main bibliometric measures:
results <- biblioAnalysis(M, sep = ";")
options(width=100)
S <- summary(object = results, k = 10, pause = FALSE)
##
##
## MAIN INFORMATION ABOUT DATA
##
## Timespan 2009 : 2019
## Sources (Journals, Books, etc) 808
## Documents 5022
## Annual Growth Rate % -3.7
## Document Average Age 8.98
## Average citations per doc 11.54
## Average citations per year per doc 1.014
## References 134769
##
## DOCUMENT TYPES
## article 4940
## article; book chapter 6
## article; early access 7
## article; proceedings paper 69
##
## DOCUMENT CONTENTS
## Keywords Plus (ID) 10861
## Author's Keywords (DE) 11088
##
## AUTHORS
## Authors 15128
## Author Appearances 23004
## Authors of single-authored docs 174
##
## AUTHORS COLLABORATION
## Single-authored docs 201
## Documents per Author 0.332
## Co-Authors per Doc 4.58
## International co-authorships % 33.43
##
##
## Annual Scientific Production
##
## Year Articles
## 2009 369
## 2010 401
## 2011 456
## 2012 492
## 2013 486
## 2014 526
## 2015 497
## 2016 496
## 2017 527
## 2018 512
## 2019 253
##
## Annual Percentage Growth Rate -3.7
##
##
## Most Productive Authors
##
## Authors Articles Authors Articles Fractionalized
## 1 GIMENEZ O 82 GIMENEZ O 17.95
## 2 PRADEL R 65 ROYLE JA 15.92
## 3 ROYLE JA 59 PRADEL R 12.82
## 4 CHOQUET R 44 BOHNING D 10.78
## 5 BARBRAUD C 40 CHOQUET R 9.91
## 6 BESNARD A 38 BARBRAUD C 9.12
## 7 TAVECCHIA G 34 WHITE GC 7.84
## 8 ORO D 32 SCHAUB M 7.78
## 9 NICHOLS JD 31 KING R 7.69
## 10 SCHAUB M 29 BESNARD A 7.51
##
##
## Top manuscripts per citations
##
## Paper DOI TC TCperYear NTC
## 1 CHOQUET R, 2009, ECOGRAPHY 10.1111/j.1600-0587.2009.05968.x 414 27.6 15.08
## 2 WHITEHEAD H, 2009, BEHAV ECOL SOCIOBIOL 10.1007/s00265-008-0697-y 350 23.3 12.75
## 3 LUIKART G, 2010, CONSERV GENET 10.1007/s10592-010-0050-7 289 20.6 13.89
## 4 GLANVILLE J, 2009, P NATL ACAD SCI USA 10.1073/pnas.0909775106 251 16.7 9.14
## 5 PATTERSON CC, 2012, DIABETOLOGIA 10.1007/s00125-012-2571-8 237 19.8 14.65
## 6 WALLACE BP, 2010, PLOS ONE 10.1371/journal.pone.0015465 207 14.8 9.95
## 7 GOMEZ P, 2011, SCIENCE 10.1126/science.1198767 195 15.0 10.27
## 8 MERTES PM, 2011, J ALLERGY CLIN IMMUN 10.1016/j.jaci.2011.03.003 165 12.7 8.69
## 9 ROYLE JA, 2009, ECOLOGY 10.1890/08-1481.1 158 10.5 5.76
## 10 SOMERS EC, 2014, ARTHRITIS RHEUMATOL 10.1002/art.38238 156 15.6 13.01
##
##
## Corresponding Author's Countries
##
## Country Articles Freq SCP MCP MCP_Ratio
## 1 USA 1784 0.3567 1454 330 0.185
## 2 AUSTRALIA 326 0.0652 202 124 0.380
## 3 FRANCE 318 0.0636 198 120 0.377
## 4 UNITED KINGDOM 318 0.0636 184 134 0.421
## 5 CANADA 304 0.0608 199 105 0.345
## 6 SPAIN 157 0.0314 95 62 0.395
## 7 ITALY 148 0.0296 89 59 0.399
## 8 GERMANY 146 0.0292 66 80 0.548
## 9 NEW ZEALAND 133 0.0266 74 59 0.444
## 10 BRAZIL 129 0.0258 95 34 0.264
##
##
## SCP: Single Country Publications
##
## MCP: Multiple Country Publications
##
##
## Total Citations per Country
##
## Country Total Citations Average Article Citations
## 1 USA 21915 12.28
## 2 FRANCE 4422 13.91
## 3 UNITED KINGDOM 4374 13.75
## 4 AUSTRALIA 3740 11.47
## 5 CANADA 3466 11.40
## 6 GERMANY 2003 13.72
## 7 NEW ZEALAND 1931 14.52
## 8 ITALY 1585 10.71
## 9 SWITZERLAND 1464 23.24
## 10 SPAIN 1429 9.10
##
##
## Most Relevant Sources
##
## Sources Articles
## 1 PLOS ONE 219
## 2 JOURNAL OF WILDLIFE MANAGEMENT 170
## 3 ECOLOGY AND EVOLUTION 116
## 4 ECOLOGY 101
## 5 BIOLOGICAL CONSERVATION 99
## 6 JOURNAL OF ANIMAL ECOLOGY 80
## 7 METHODS IN ECOLOGY AND EVOLUTION 77
## 8 JOURNAL OF MAMMALOGY 73
## 9 JOURNAL OF APPLIED ECOLOGY 72
## 10 NORTH AMERICAN JOURNAL OF FISHERIES MANAGEMENT 65
##
##
## Most Relevant Keywords
##
## Author Keywords (DE) Articles Keywords-Plus (ID) Articles
## 1 MARK-RECAPTURE 687 SURVIVAL 647
## 2 CAPTURE-RECAPTURE 460 CONSERVATION 525
## 3 SURVIVAL 326 CAPTURE-RECAPTURE 497
## 4 CAPTURE-MARK-RECAPTURE 246 ABUNDANCE 494
## 5 ABUNDANCE 173 POPULATION 491
## 6 POPULATION DYNAMICS 145 MARKED ANIMALS 404
## 7 DEMOGRAPHY 140 SIZE 371
## 8 DISPERSAL 138 POPULATIONS 339
## 9 CONSERVATION 131 MARK-RECAPTURE 328
## 10 POPULATION SIZE 125 DYNAMICS 302
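Several figures in this summary can be checked by hand from the counts it prints. The sketch below assumes the annual growth rate is a compound annual growth rate between the first and last years, which is our reading of the output rather than something stated in it; under that assumption the numbers agree with the summary once rounded:

```r
# Articles in the first and last years of the annual production table
first_year <- 2009; n_first <- 369
last_year  <- 2019; n_last  <- 253

# Compound annual growth rate, in percent (assumed definition)
growth <- ((n_last / n_first)^(1 / (last_year - first_year)) - 1) * 100

# Ratios rebuilt from the counts printed in the summary
docs_per_author   <- 5022 / 15128   # documents / authors
coauthors_per_doc <- 23004 / 5022   # author appearances / documents
mcp_ratio_usa     <- 330 / 1784     # multi-country papers / papers, USA

print(round(c(growth = growth,
              docs_per_author = docs_per_author,
              coauthors_per_doc = coauthors_per_doc,
              mcp_ratio_usa = mcp_ratio_usa), 3))
```

The negative growth rate is driven by the 2019 count (253), which is about half that of the preceding years and probably reflects an incomplete year at the time of the search.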
Visualize:
The 100 most frequently cited manuscripts:
## [,1]
## WHITE GC, 1999, BIRD STUDY, V46, P120 1310
## BURNHAM K, 2002, MODEL SELECTION MULT 1131
## LEBRETON JD, 1992, ECOL MONOGR, V62, P67, DOI 10.2307/2937171 835
## WILLIAMS B. K., 2002, ANAL MANAGEMENT ANIM 546
## OTIS DL, 1978, WILDLIFE MONOGR, P1 536
## JOLLY GM, 1965, BIOMETRIKA, V52, P225, DOI 10.1093/BIOMET/52.1-2.225 368
## SEBER GAF, 1965, BIOMETRIKA, V52, P249 320
## CHOQUET R, 2009, ECOGRAPHY, V32, P1071, DOI 10.1111/J.1600-0587.2009.05968.X 313
## SEBER GA, 1982, ESTIMATION ANIMAL AB 306
## KENDALL WL, 1997, ECOLOGY, V78, P563 277
## BORCHERS DL, 2008, BIOMETRICS, V64, P377, DOI 10.1111/J.1541-0420.2007.00927.X 265
## CORMACK RM, 1964, BIOMETRIKA, V51, P429, DOI 10.1093/BIOMET/51.3-4.429 243
## POLLOCK KH, 1982, J WILDLIFE MANAGE, V46, P752, DOI 10.2307/3808568 233
## EFFORD M, 2004, OIKOS, V106, P598, DOI 10.1111/J.0030-1299.2004.13043.X 228
## PRADEL R, 1996, BIOMETRICS, V52, P703, DOI 10.2307/2532908 217
## KARANTH KU, 1998, ECOLOGY, V79, P2852 214
## CASWELL H., 2001, MATRIX POPULATION MO 203
## HUGGINS RM, 1989, BIOMETRIKA, V76, P133, DOI 10.1093/BIOMET/76.1.133 203
## POLLOCK KH, 1990, WILDLIFE MONOGR, P1 203
## SCHWARZ CJ, 1996, BIOMETRICS, V52, P860, DOI 10.2307/2533048 201
## PRADEL R, 2005, BIOMETRICS, V61, P442, DOI 10.1111/J.1541-0420.2005.00318.X 196
## BROWNIE C, 1993, BIOMETRICS, V49, P1173, DOI 10.2307/2532259 195
## HOOK EB, 1995, EPIDEMIOL REV, V17, P243, DOI 10.1093/OXFORDJOURNALS.EPIREV.A036192 193
## CHOQUET R, 2009, ENVIRON ECOL STAT SE, V3, P845, DOI 10.1007/978-0-387-78151-8_39 185
## PRADEL R, 1997, BIOMETRICS, V53, P60, DOI 10.2307/2533097 185
## PLEDGER S, 2000, BIOMETRICS, V56, P434, DOI 10.1111/J.0006-341X.2000.00434.X 171
## ROYLE JA, 2014, SPATIAL CAPTURE-RECAPTURE, P1 166
## STEARNS SC, 1992, EVOLUTION LIFE HIST 163
## BUCKLAND S. T, 2001, INTRO DISTANCE SAMPL 162
## KERY M, 2012, BAYESIAN POPULATION ANALYSIS USING WINBUGS: A HIERARCHICAL PERSPECTIVE, P1 154
## BURNHAM K. P., 1998, MODEL SELECTION INFE 147
## ROYLE JA, 2008, ECOLOGY, V89, P2281, DOI 10.1890/07-0601.1 146
## HUGGINS RM, 1991, BIOMETRICS, V47, P725, DOI 10.2307/2532158 143
## ROYLE J. A., 2008, HIERARCHICAL MODELIN 139
## ROYLE JA, 2009, ECOLOGY, V90, P3233, DOI 10.1890/08-1481.1 125
## LEBRETON JD, 2002, J APPL STAT, V29, P353, DOI 10.1080/02664760120108638 124
## MACKENZIE DI, 2002, ECOLOGY, V83, P2248, DOI 10.2307/3072056 123
## LEBRETON JD, 2009, ADV ECOL RES, V41, P87, DOI 10.1016/S0065-2504(09)00403-6 120
## KARANTH KU, 1995, BIOL CONSERV, V71, P333, DOI 10.1016/0006-3207(94)00057-W 118
## SAETHER BE, 2000, ECOLOGY, V81, P642, DOI 10.2307/177366 118
## MACKENZIE DI, 2006, OCCUPANCY ESTIMATION 114
## KENDALL WL, 1995, BIOMETRICS, V51, P293, DOI 10.2307/2533335 113
## BURNHAM K.P., 2002, MODEL SELECTION INFE 110
## HESTBECK JB, 1991, ECOLOGY, V72, P523, DOI 10.2307/2937193 106
## AMSTRUP SC, 2005, HANDBOOK OF CAPTURE-RECAPTURE ANALYSIS, P1 104
## BROOKS SP, 1998, J COMPUT GRAPH STAT, V7, P434, DOI 10.2307/1390675 103
## SCHWARZ CJ, 1993, BIOMETRICS, V49, P177, DOI 10.2307/2532612 103
## WOODS JG, 1999, WILDLIFE SOC B, V27, P616 103
## KENDALL WL, 1999, ECOLOGY, V80, P2517, DOI 10.1890/0012-9658(1999)0802517:ROCCRM2.0.CO 102
## GROSBOIS V, 2008, BIOL REV, V83, P357, DOI 10.1111/J.1469-185X.2008.00047.X 101
## R CORE TEAM, 2015, R LANG ENV STAT COMP 101
## WAITS LP, 2001, MOL ECOL, V10, P249, DOI 10.1046/J.1365-294X.2001.01185.X 100
## CHAO A, 1987, BIOMETRICS, V43, P783, DOI 10.2307/2531532 99
## CHAO A, 2001, STAT MED, V20, P3123, DOI 10.1002/SIM.996.ABS 99
## LINK WA, 2003, BIOMETRICS, V59, P1123, DOI 10.1111/J.0006-341X.2003.00129.X 99
## AKAIKE H., 1973, 2 INT S INF THEOR, P267, DOI DOI 10.1007/978-1-4612-1694-0_ 97
## BURNHAM KENNETH P., 1993, P199 97
## EFFORD MG, 2009, ENVIRON ECOL STAT SE, V3, P255, DOI 10.1007/978-0-387-78151-8_11 95
## R DEVELOPMENT CORE TEAM, 2011, R LANG ENV STAT COMP 94
## PRADEL R., 2005, ANIMAL BIODIVERSITY AND CONSERVATION, V28, P189 93
## R CORE TEAM, 2013, R LANG ENV STAT COMP 93
## GREENWOOD PJ, 1980, ANIM BEHAV, V28, P1140, DOI 10.1016/S0003-3472(80)80103-5 92
## YIP PSF, 1995, AM J EPIDEMIOL, V142, P1047 92
## PLEDGER S, 2003, BIOMETRICS, V59, P786, DOI 10.1111/J.0006-341X.2003.00092.X 90
## KENDALL WL, 2002, ECOLOGY, V83, P3276 89
## PAETKAU D, 2003, MOL ECOL, V12, P1375, DOI 10.1046/J.1365-294X.2003.01820.X 89
## SOISALO MK, 2006, BIOL CONSERV, V129, P487, DOI 10.1016/J.BIOCON.2005.11.023 89
## WILSON B, 1999, ECOL APPL, V9, P288, DOI 10.2307/2641186 89
## WAITS LP, 2005, J WILDLIFE MANAGE, V69, P1419, DOI 10.2193/0022-541X(2005)691419:NGSTFW2.0.CO 88
## ARNOLD TW, 2010, J WILDLIFE MANAGE, V74, P1175, DOI 10.2193/2009-367 86
## PRITCHARD JK, 2000, GENETICS, V155, P945 86
## R CORE TEAM, 2016, R LANG ENV STAT COMP 86
## SILVER SC, 2004, ORYX, V38, P148, DOI 10.1017/S0030605304000286 86
## SOLLMANN R, 2011, BIOL CONSERV, V144, P1017, DOI 10.1016/J.BIOCON.2010.12.011 86
## GAILLARD JM, 2000, ANNU REV ECOL SYST, V31, P367, DOI 10.1146/ANNUREV.ECOLSYS.31.1.367 85
## GAILLARD JM, 2003, ECOLOGY, V84, P3294, DOI 10.1890/02-0409 85
## R CORE TEAM, 2014, R LANG ENV STAT COMP 84
## KARANTH KU, 2006, ECOLOGY, V87, P2925, DOI 10.1890/0012-9658(2006)872925:ATPDUP2.0.CO 83
## SPIEGELHALTER DJ, 2002, J ROY STAT SOC B, V64, P583, DOI 10.1111/1467-9868.00353 83
## ANDERSON DR, 1994, ECOLOGY, V75, P1780, DOI 10.2307/1939637 82
## GELMAN A, 2004, BAYESIAN DATA ANAL 82
## STANLEY TR, 1999, ENVIRON ECOL STAT, V6, P197, DOI 10.1023/A:1009674322348 82
## GELMAN A, 1992, STAT SCI, V7, P457, DOI DOI 10.1214/SS/1177011136 81
## LUNN DJ, 2000, STAT COMPUT, V10, P325, DOI 10.1023/A:1008929526011 81
## SCHWARZ CJ, 1999, STAT SCI, V14, P427 81
## WILSON KR, 1985, J MAMMAL, V66, P13, DOI 10.2307/1380951 81
## KREBS CJ, 1999, ECOLOGICAL METHODOLO 80
## PULLIAM HR, 1988, AM NAT, V132, P652, DOI 10.1086/284880 80
## FOSTER RJ, 2012, J WILDLIFE MANAGE, V76, P224, DOI 10.1002/JWMG.275 78
## BESBEAS P, 2002, BIOMETRICS, V58, P540, DOI 10.1111/J.0006-341X.2002.00540.X 77
## GAILLARD JM, 1998, TRENDS ECOL EVOL, V13, P58, DOI 10.1016/S0169-5347(97)01237-8 77
## MORRIS W. F., 2002, QUANTITATIVE CONSERV 77
## SCHAUB M, 2004, ECOLOGY, V85, P2107, DOI 10.1890/03-3110 77
## LUKACS PM, 2005, MOL ECOL, V14, P3909, DOI 10.1111/J.1365-294X.2005.02717.X 76
## MILLER CR, 2005, MOL ECOL, V14, P1991, DOI 10.1111/J.1365-294X.2005.02577.X 76
## R DEVELOPMENT CORE TEAM, 2012, R LANG ENV STAT COMP 76
## REXSTAD E., 1991, USERS GUIDE INTERACT 76
## WHITE G. C., 1982, CAPTURE RECAPTURE RE 76
## EFFORD M. G., 2004, ANIMAL BIODIVERSITY AND CONSERVATION, V27, P217 75
## HURVICH CM, 1989, BIOMETRIKA, V76, P297, DOI 10.2307/2336663 75
The most frequently cited first authors:
## [,1]
## WHITE GC 1671
## LEBRETON JD 1254
## ROYLE JA 1249
## BURNHAM K 1144
## PRADEL R 1017
## KENDALL WL 919
## POLLOCK KH 891
## CHOQUET R 858
## NICHOLS JD 671
## R DEVELOPMENT CORE TEAM 648
## KARANTH KU 620
## WILLIAMS B K 602
## OTIS DL 553
## CHAO A 540
## SCHWARZ CJ 512
## R CORE TEAM 511
## SCHAUB M 505
## SEBER GAF 488
## MACKENZIE DI 475
## BURNHAM K P 466
## BURNHAM KP 461
## KERY M 449
## EFFORD MG 435
## GELMAN A 399
## PLEDGER S 399
Top authors' productivity over time:
Below is an author collaboration network, where nodes represent the top 30 authors in terms of the number of papers authored in our dataset, and links are co-authorships. The Louvain algorithm is used throughout for clustering:
M <- metaTagExtraction(M, Field = "AU_CO", sep = ";")
NetMatrix <- biblioNetwork(M, analysis = "collaboration", network = "authors", sep = ";")
net <- networkPlot(NetMatrix, n = 30, Title = "Collaboration network", type = "fruchterman",
size = TRUE, remove.multiple = FALSE, labelsize = 0.7, cluster = "louvain")
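To make the network concrete: a co-authorship matrix can be obtained from a documents-by-authors incidence matrix A as the cross-product A'A, whose off-diagonal entries count shared papers. The base R sketch below uses made-up author strings in the semicolon-separated format of the AU field; it illustrates the idea and is not a description of how biblioNetwork() is implemented:

```r
# Made-up author fields for three papers, separated by ";" as in AU
toy_au <- c("AUTHOR A;AUTHOR B",
            "AUTHOR A;AUTHOR C",
            "AUTHOR A;AUTHOR B;AUTHOR C")

author_lists <- strsplit(toy_au, ";", fixed = TRUE)
authors <- sort(unique(unlist(author_lists)))

# Incidence matrix: one row per paper, one column per author (1 = authored)
incidence <- t(vapply(author_lists,
                      function(x) as.integer(authors %in% x),
                      integer(length(authors))))
colnames(incidence) <- authors

# Co-authorship counts: diagonal = papers per author,
# off-diagonal = papers shared by the two authors
coauth <- crossprod(incidence)
print(coauth)
```

Clustering algorithms such as Louvain then operate on the graph defined by the off-diagonal entries.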
Country collaborations:
NetMatrix <- biblioNetwork(M, analysis = "collaboration", network = "countries", sep = ";")
net <- networkPlot(NetMatrix, n = 20, Title = "Country collaborations", type = "fruchterman",
size = TRUE, remove.multiple = FALSE, labelsize = 0.7, cluster = "louvain")
A keyword co-occurrences network:
NetMatrix <- biblioNetwork(M, analysis = "co-occurrences", network = "keywords", sep = ";")
netstat <- networkStat(NetMatrix)
summary(netstat, k = 10)
##
##
## Main statistics about the network
##
## Size 10867
## Density 0.002
## Transitivity 0.08
## Diameter 6
## Degree Centralization 0.192
## Average path length 2.772
##
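Density is the share of possible node pairs that are actually linked. A small base R sketch, using a toy adjacency matrix of our own, alongside the number of possible pairs for a network of the size reported above:

```r
# Toy undirected network: 4 keywords, 3 co-occurrence links
adj <- matrix(0, 4, 4, dimnames = list(letters[1:4], letters[1:4]))
adj["a", "b"] <- adj["b", "a"] <- 1
adj["a", "c"] <- adj["c", "a"] <- 1
adj["c", "d"] <- adj["d", "c"] <- 1

# Density = observed links / possible links
n_nodes <- nrow(adj)
n_edges <- sum(adj[upper.tri(adj)] > 0)
density <- n_edges / (n_nodes * (n_nodes - 1) / 2)

# Number of possible keyword pairs in a network with 10867 nodes
possible_pairs <- 10867 * (10867 - 1) / 2

print(density)
print(possible_pairs)
```

With a density of about 0.002, only roughly 0.2% of the possible keyword pairs co-occur in at least one paper.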
To learn more about textual analysis, and topic modelling in particular, we recommend reading Text Mining with R.
Clean and format the data:
wordfabs <- dat %>%
mutate(line = row_number()) %>%
filter(nchar(AB) > 0) %>%
unnest_tokens(word, AB) %>%
anti_join(stop_words) %>%
filter(str_detect(word, "[^\\d]")) %>%
group_by(word) %>%
mutate(word_total = n()) %>%
ungroup()
desc_dtm <- wordfabs %>%
count(line, word, sort = TRUE) %>%
ungroup() %>%
cast_dtm(line, word, n)
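To see what this pipeline produces, here is a base R sketch on two made-up abstracts. It mimics the same steps (tokenise, drop stop words, drop tokens made only of digits, count words per document) with a tiny hand-picked stop-word list instead of stop_words; all object names are illustrative:

```r
abstracts <- c("Survival of marked birds",
               "Abundance of marked fish in 2019")

# Tokenise: lower-case, split on anything that is not a letter or digit
tokens <- strsplit(tolower(abstracts), "[^a-z0-9]+")
long <- data.frame(line = rep(seq_along(tokens), lengths(tokens)),
                   word = unlist(tokens),
                   stringsAsFactors = FALSE)

# Drop stop words (tiny illustrative list)
long <- long[!long$word %in% c("of", "in"), ]

# Keep tokens containing at least one non-digit character,
# i.e. the same test as str_detect(word, "[^\\d]")
long <- long[grepl("[^\\d]", long$word, perl = TRUE), ]

# Document-term matrix: one row per abstract, one column per word
dtm <- unclass(table(long$line, long$word))
print(dtm)
```

The real desc_dtm has the same shape, with one row per abstract and one column per distinct word, but is stored in a sparse format.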
Perform the analysis (this takes several minutes):
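The model-fitting chunk is not shown, although the next chunk relies on an object called tidy_lda. As a sketch of what such a step typically looks like with topicmodels and tidytext, the example below fits an LDA model to a tiny made-up count matrix; the number of topics k, the seed and the data are placeholders, not the settings used for the analysis:

```r
library(topicmodels)  # LDA()
library(tidytext)     # tidy() method for LDA objects

# Toy document-term counts: 6 documents, 6 terms (made-up data)
toy_counts <- matrix(
  c(5, 4, 3, 0, 0, 1,
    4, 5, 2, 1, 0, 0,
    6, 3, 4, 0, 1, 0,
    0, 1, 0, 5, 4, 3,
    1, 0, 0, 4, 6, 2,
    0, 0, 1, 3, 5, 4),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("doc", 1:6),
                  c("survival", "birds", "migration",
                    "abundance", "fish", "growth")))
storage.mode(toy_counts) <- "integer"  # LDA() expects integer counts

# Fit the model; k = 2 is only sensible for this toy example
toy_lda <- LDA(toy_counts, k = 2, control = list(seed = 1234))

# One row per topic-term pair, with beta = P(term | topic)
tidy_lda_toy <- tidy(toy_lda, matrix = "beta")
print(tidy_lda_toy)
```

With the real data, the same two calls would be applied to desc_dtm, with a larger k, to obtain tidy_lda.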
Visualise results:
top_terms <- tidy_lda %>%
filter(topic < 13) %>%
group_by(topic) %>%
top_n(10, beta) %>%
ungroup() %>%
arrange(topic, -beta)
top_terms %>%
mutate(term = reorder(term, beta)) %>%
group_by(topic, term) %>%
arrange(desc(beta)) %>%
ungroup() %>%
mutate(term = factor(paste(term, topic, sep = "__"),
levels = rev(paste(term, topic, sep = "__")))) %>%
ggplot(aes(term, beta, fill = as.factor(topic))) +
geom_col(show.legend = FALSE) +
coord_flip() +
scale_x_discrete(labels = function(x) gsub("__.+$", "", x)) +
labs(title = "Top 10 terms in each LDA topic",
x = NULL, y = expression(beta)) +
facet_wrap(~ topic, ncol = 4, scales = "free")
This is quite informative! Topics can fairly easily be interpreted: 1 is about estimating fish survival, 2 is about photo-id, 3 is about modeling and estimation in general, 4 is about disease ecology, 5 is about estimating abundance of marine mammals, 6 is about capture-recapture in (human) health sciences, 7 is about the conservation of large carnivores (tigers, leopards), 8 is about growth and recruitment, 9 is about prevalence estimation in humans, 10 is about the estimation of individual growth in fish, 11 is (not a surprise) about birds (migration and reproduction), and 12 is about habitat perturbations.
Our objective was to make a list of ecological questions and methods that were addressed in these papers. The bibliometric and text analyses above were useful, but we needed to dig a bit deeper to achieve the objective. Here is how we did it.
First, we isolated the methodological journals. To do so, we focused the search on journals that had published more than 10 papers about capture-recapture over the last 10 years:
raw_dat <- read_csv(file = 'data/crdat.csv')
raw_dat %>%
group_by(journal) %>%
filter(n() > 10) %>%
ungroup() %>%
count(journal)
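The group_by() / filter(n() > 10) idiom keeps every row whose journal appears more than 10 times. The same logic in base R, on a made-up vector of journal names (a sketch for illustration; the threshold is the only value taken from the text):

```r
# Made-up corpus: journal A has 12 papers, B has 10, C has 3
journal <- rep(c("JOURNAL A", "JOURNAL B", "JOURNAL C"), times = c(12, 10, 3))

# Papers per journal, then keep journals strictly above the threshold
n_per_journal <- table(journal)
kept <- names(n_per_journal)[n_per_journal > 10]

# Rows retained, as filter(n() > 10) would retain them
retained <- journal[journal %in% kept]
print(table(retained))
```

Note that the comparison is strict, so a journal with exactly 10 papers is dropped.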
By inspecting this list, we ended up with these methodological journals:
methods <- raw_dat %>%
filter(journal %in% c('BIOMETRICS',
'ECOLOGICAL MODELLING',
'JOURNAL OF AGRICULTURAL BIOLOGICAL AND ENVIRONMENTAL STATISTICS',
'METHODS IN ECOLOGY AND EVOLUTION',
'ANNALS OF APPLIED STATISTICS',
'ENVIRONMENTAL AND ECOLOGICAL STATISTICS'))
methods %>%
count(journal, sort = TRUE)
We then exported the 219 papers published in these methodological journals to a csv file:
# Reuse the `methods` tibble built above rather than repeating the filter
methods %>%
write_csv('data/papers_in_methodological_journals.csv')
The next step was to annotate this file to determine the methods used. R could not help, and we had to do it by hand. We read the >200 titles and abstracts and added our tags in an extra column. The task was cumbersome but very interesting. We enjoyed seeing what colleagues have been working on. The results are in this file.
By focusing the annotation on the methodological journals, we ignored all the methodological papers published in non-methodological journals that welcome methods, such as Ecology, Journal of Applied Ecology, Conservation Biology and PLOS ONE. We address this issue below. In brief, we scanned the corpus of ecological papers and tagged all methodological papers (126 in total); we added them to the file of methodological papers, along with a column to keep track of each paper's origin (methodological vs ecological corpus).
Second, we isolated the ecological journals. To do so, we focused the search on journals that had published more than 50 papers about capture-recapture over the last 10 years, and we excluded the methodological journals:
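Keeping track of where each tagged paper came from amounts to adding an origin column before stacking the two tables. A base R sketch with made-up rows; the object names, column names (title, tags, origin) and tag values are hypothetical and need not match the annotated file:

```r
# Made-up annotated papers from the methodological journals
methodo <- data.frame(title = c("PAPER 1", "PAPER 2"),
                      tags  = c("tag one", "tag two"),
                      stringsAsFactors = FALSE)

# Made-up methodological paper found in the ecological corpus
from_ecol <- data.frame(title = "PAPER 3",
                        tags  = "tag three",
                        stringsAsFactors = FALSE)

# Record the corpus each paper came from, then stack the two tables
methodo$origin   <- "methodological"
from_ecol$origin <- "ecological"
all_methods <- rbind(methodo, from_ecol)

print(table(all_methods$origin))
```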
ecol <- raw_dat %>%
filter(!journal %in% c('BIOMETRICS',
'ECOLOGICAL MODELLING',
'JOURNAL OF AGRICULTURAL BIOLOGICAL AND ENVIRONMENTAL STATISTICS',
'METHODS IN ECOLOGY AND EVOLUTION',
'ANNALS OF APPLIED STATISTICS',
'ENVIRONMENTAL AND ECOLOGICAL STATISTICS')) %>%
group_by(journal) %>%
filter(n() > 50) %>%
ungroup()
ecol %>%
count(journal, sort = TRUE)
nrow(ecol)
## [1] 1378
Again, we inspected the papers one by one. We mainly focused the reading on the titles and abstracts. We did not annotate the papers.
This work initially started as a talk we gave at the Wildlife Research and Conservation 2019 conference in Berlin. The slides can be downloaded here. There is also a video recording of the talk there, and a Twitter thread of it. We also presented a poster at the Euring 2021 conference, see here.
R version used:
## R version 4.2.3 (2023-03-15)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Monterey 12.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] topicmodels_0.2-14 tidytext_0.4.1 quanteda_3.3.1 bibliometrix_4.1.3 lubridate_1.9.2 forcats_1.0.0
## [7] stringr_1.5.0 dplyr_1.1.2 purrr_1.0.1 readr_2.1.4 tidyr_1.3.0 tibble_3.2.1
## [13] ggplot2_3.4.2 tidyverse_2.0.0
##
## loaded via a namespace (and not attached):
## [1] TH.data_1.1-2 colorspace_2.1-0 ellipsis_0.3.2 modeltools_0.2-23 estimability_1.4.1
## [6] rstudioapi_0.14 farver_2.1.1 dimensionsR_0.0.3 rscopus_0.6.6 SnowballC_0.7.1
## [11] bit64_4.0.5 ggrepel_0.9.3 DT_0.28 fansi_1.0.4 mvtnorm_1.2-2
## [16] xml2_1.3.4 codetools_0.2-19 splines_4.2.3 leaps_3.1 cachem_1.0.8
## [21] knitr_1.43 jsonlite_1.8.5 cluster_2.1.4 shiny_1.7.4 rentrez_1.2.3
## [26] compiler_4.2.3 httr_1.4.6 emmeans_1.8.5 Matrix_1.5-4 fastmap_1.1.1
## [31] lazyeval_0.2.2 cli_3.6.1 later_1.3.1 htmltools_0.5.5 tools_4.2.3
## [36] NLP_0.2-1 igraph_1.4.2 coda_0.19-4 gtable_0.3.3 glue_1.6.2
## [41] reshape2_1.4.4 FactoMineR_2.8 fastmatch_1.1-3 Rcpp_1.0.10 slam_0.1-50
## [46] cellranger_1.1.0 jquerylib_0.1.4 vctrs_0.6.3 xfun_0.39 stopwords_2.3
## [51] openxlsx_4.2.5.2 timechange_0.2.0 mime_0.12 lifecycle_1.0.3 XML_3.99-0.14
## [56] stringdist_0.9.10 bibliometrixData_0.3.0 MASS_7.3-59 zoo_1.8-12 scales_1.2.1
## [61] vroom_1.6.3 ragg_1.2.5 hms_1.1.3 promises_1.2.0.1 parallel_4.2.3
## [66] sandwich_3.0-2 yaml_2.3.7 sass_0.4.6 stringi_1.7.12 highr_0.10
## [71] tokenizers_0.3.0 zip_2.3.0 systemfonts_1.0.4 rlang_1.1.1 pkgconfig_2.0.3
## [76] evaluate_0.21 lattice_0.21-8 labeling_0.4.2 htmlwidgets_1.6.2 bit_4.0.5
## [81] tidyselect_1.2.0 plyr_1.8.8 magrittr_2.0.3 R6_2.5.1 generics_0.1.3
## [86] multcompView_0.1-9 multcomp_1.4-23 pillar_1.9.0 withr_2.5.0 survival_3.5-5
## [91] scatterplot3d_0.3-44 crayon_1.5.2 janeaustenr_1.0.0 utf8_1.2.3 plotly_4.10.1
## [96] tzdb_0.4.0 rmarkdown_2.22 grid_4.2.3 readxl_1.4.2 data.table_1.14.8
## [101] digest_0.6.31 flashClust_1.01-2 tm_0.7-11 xtable_1.8-4 httpuv_1.6.11
## [106] textshaping_0.3.6 RcppParallel_5.1.7 stats4_4.2.3 munsell_0.5.0 viridisLite_0.4.2
## [111] pubmedR_0.0.3 bslib_0.5.0