TL;DR

Use the following table to search for indexed scientific journals, check for their impact factor (IF) and see if they are open access (OA) and covered within the “Read and Publish” agreement for Australian and New Zealand academics.

Now for the full story:

Academic Publishing and Open Access

In the academic world you are being measured by a very cruel metric – your publication and citation index, which is derived from the number of scientific articles (and books) published, their impact factor (IF) and the number of times they were cited (often measured as H-index). These measures are closely looked at and compared when selecting candidates for research and lecturing positions and in many cases can determine a career trajectory.

A major emphasis is given in recent years to democratise scientific findings, so academics are encouraged to publish their findings in Open Access (OA) journals, bypassing the so called “paywall” and making them widely available to the public. However, publishing is such journals is often very expensive and can reach thousands of dollars, thus discriminating against students, early career researchers and scientists from low-income countries who are unable to afford it.

CAUL APC Agreement

A very welcomed change is coming from The Council of Australian University Librarians (CAUL) consortium, who have signed a collective agreement with leading publishers to allow Australian and NZ academics to publish in selected OA journals free-of-charge 😊.

The agreement currently covers more than 5,300 entries (available as an online table), however, it is extremely difficult searching for a relevant journal and more than that, finding its impact factor, which as said, is a major consideration when choosing a journal to publish your hard-earned precious findings.

Finding a Journal’s Impact Factor

Despite its importance, it’s not as easy as you would think finding the impact factor of a journal…
A couple of options are the official Web of Science Journal Citation Reports (JCR) platform, but this service requires registering a user account and logging in (you can use your ANZ University credentials using AAF). In addition this service doesn’t allow for batch searching and downloading and again makes the task of finding a relevant Open Access journal’s IF very time consuming and laborious.

Other websites took a step forward and compiled a list of these journals and their IF from the official JCR website and made them available for download in Excel format (2021 IF spreadsheet at impactfactorforjournal.com/jcr-2021) or as a PDF document for 2023 IF rankings (impactfactorforjournal.com/jcr-impact-factor-2022, or phdtalk.org), which makes searching through the tables much easier. Furthermore, phdtalk.org website contains many useful resources for PhD students, including an online tool (journalfinder.phdtalks.org) for searching indexed and OA journals by keywords and/or category, but still, this useful tool does not provide IF for the journals it finds.

Another useful source for journal ranking is the Scimago Journal Ranking (SJR), which provides a searchable (and downloadable!) table of journals, with their SJR ranking, H-index and the journal’s ranking in the discipline/category (Q1/2/3/4).

Searchable Open Access Journal Impact Factor Table

This is where my R and Rmarkdown skills come handy!
I used a handful of very useful packages for this work:

  • pak is my new favourite way to install packages in R, it supports easy, quick and stress-free installation of packages from CRAN, GitHub and Bioconductor.
  • pacman, which provides a one-liner command to load/install packages and even supports NSE (no need to quote the package names, tidyverse-style).
  • dplyr for general data wrangling and readxl (which is actually a part of the tidyverse) to read data from Excel sheets and CSV files.
  • snakecase to consistently convert strings (journal titles) to facilitate merging of the different tables.
  • janitor, which has a lot of useful functions for cleaning and tidying datasets.
# install needed packages
packs <- c('tidyverse', 'readxl', 'snakecase', 'janitor',  'pacman', 'glue')
# install needed packages with pak
install.packages("pak")
pak::pak(packs)
# load required packages (using pacman package)
library(pacman)
p_load(p_load(char = basename(packs)))

The following code snippets describe how I compiled a searchable IF table for OA journals covered by the CAUL agreement by combining the available tables from CAUL (OA journals) and JCR (IF table). The pre-compiled IF table was downloaded from impactfactorforjournal.com (scroll to the very bottom of the page or use this direct download link JCR2021.xlsx for 2021 JCR IF ranking).
The 2023 IF rankings were provided as a PDF file at impactfactorforjournal.com/jcr-impact-factor-2022 (scroll to the very bottom of the page or use this direct download link Impact-Factor-2023-PDF.pdf). To use this file, I copied the table content and pasted it into Excel as a CSV file (JCR2023.csv) that we will process below. Another alternative for a fully programmatic approach it to parse the PDF and extract the table from it with tools such as pdfx (Python-based) or the R packages tabulizer or tabulapdf.

The data table was read from the spreadsheet and cleaned with various functions from janitor package to “normalise” headers (lower caps, with underscores to replace spaces) and sanitise journal titles (convert all to “Title Case” while keeping common abbreviations intact and replace ‘&’ with ‘and’). Details of additional journals (that I publish at) were added to the data based on individual search at CJR (feel free to add additional journals that you feel are missing from the dataset).

# define common abbreviations (that won't be converted by snakecase)
journal_abvs <- c( "ICES", "BMC","BMJ", "ACTA", "ACS", "AAPG", "AAPS", "AATCC", "ACM", "AAS", "ACI", "AEU", "AERA", "AI", "AIAA", "AIBR", "AIDS", "AIMS", "ASTIN",
                  "AJAR", "AJIDD", "AJIL", "AKCE", "ALEA", "AMB", "AMA", "ALTEX", "ATW", "PDE", "ANZ", "AoB", "AORN", "APL", "APMIS", "ARDEA", "ARS", "ASAIO",
                  "ASCE", "ASME", "ASHRAE", "AStA", "ATLA", "AUK", "BJS", "BJU", "BMB", "BMGN", "BMJ", "BRQ", "BSGF","CA", "CBE", "COMPEL", "COPD", "CMC",
                  "CUAJ", "CPT","CTS", "DARU","EFSA", "ENT","EPE", "ESAIM", "EURE", "GAIA", "GLQ", "HAHR", "HAU","ICGA", "ICHNOS", "ICON", "IEEE","IEICE", "IRAL","ISI", "ISJ","ITE", "ITEA",
                  "JAAPA", "JACC", "JAIDS", "JANAC",'JAMA', "JARO", "JARQ", "JASSS", "JAVMA", "JAVNOST", "JCMS", "JCPSP", "JCR", "JNCI", "JNP", "JOGNN", "JPAD", 
                  "JPC","JSLS","KGK","LIBRI","LHB", "LUTS", "LWT", "MAPAN", "MCN", "MMWR", "NAUNYN", "NJAS", "NODEA", "NWIG", "ORL", "OTJR", "PFG", "QJM", "PS",
                  "QME", "RAE", "RAIRO", "RBGN", "REAL", "REDIA", "REVSTAT", "RIDE", "RILCE", "RLA", "ROFO", "RSF", "SADHANA", "SEN",  "SHILAP", "SIAM", "SJR", "TAPPI",
                  "TOPIA", "TRAC" , "TRAMES", "UHOD", "VIAL", "ZAMM", "ZDM", "ZKG"
                  ) # n"CA","CAA",
# create a function to sanitise journal titles
sanitize_title <- function(j, abbr_list=journal_abvs, change_allcaps_only = FALSE){
  if (isTRUE(change_allcaps_only) & grepl("[a-z]+", j)) {
    return(gsub("&", "and", j, fixed = TRUE) %>% gsub("-", " - ", ., fixed = TRUE))
  }
  j <- to_title_case(gsub("&", "and", j, fixed = TRUE), sep_in = " ",
                                  abbreviations = abbr_list, parsing_option=3)
  return(j %>% gsub("-", " - ", ., fixed = TRUE))
}
regex_query <- "(\\d+)\\s+([\\w\\-\\s\\&]+)\\s+([\\d\\.]+)$"
# Using the 2023 CSV and parsing the single-field into Rank, Title and IF
JCR_IF_table <- read_csv("data/JCR2023.csv", col_names = c("description")) %>% 
  remove_empty(which = "rows") %>% # filter(description=="")
  distinct() %>%
  mutate(rank = str_extract(description, regex_query, 1),
         full_journal_title = str_extract(description, regex_query, 2),
         journal_IF = as.numeric(str_extract(description, regex_query, 3)),
         # title = gsub("&", "and", full_journal_title, fixed = TRUE),
         full_title=map_chr(full_journal_title, .f = sanitize_title)) %>% 
  filter(!is.na(full_journal_title))


# manually add IF data for journals not found in the database from https://jcr.clarivate.com/jcr/browse-journals
my_IF_table <- tribble(~full_title, ~journal_IF, # ~Publisher,
                       #"Microbial Genomics", 3.8, #"Microbiology Society"
                       "Genes", 2.8, 
                       #"Scientific Reports", 4,
                       "SpringerPlus", 1.130,
                       # "Frontiers in Chemistry", 3.8,
                       # "Frontiers in Plant Science", 4.1,
                       # "Frontiers in Genetics", 2.8
                       )
# Using the 2021 Excel file
# IF_table <- read_excel("data/JCR2021.xlsx", sheet = "Table 1", skip = 2) %>% 
#   clean_names() %>% 
#     # filter(journal_impact_factor!="Not Available") %>%
#     remove_empty(which = "rows") %>%
#     distinct() %>%
#     mutate(journal_IF=as.numeric(journal_impact_factor),
#            full_title=to_title_case(gsub("&", "and", full_journal_title, fixed = TRUE), 
#                                     abbreviations = journal_abvs, parsing_option=3)) %>% 
#   bind_rows(my_IF_table)

The SJR table was downloaded from the Scimago Journal Ranking website, processed in a similar way to the JCR table and the 2 tables were combined based on the journal title.

SJR_table <- read_delim("data/scimagojr 2023.csv", delim = ";") %>% 
   clean_names(case="mixed", abbreviations = c("SJR", "ISSN", "ID")) %>% 
  mutate(full_title=gsub("Thalassas:.+", "Thalassas", Title),
         full_title=map_chr(full_title, .f = sanitize_title),
         SJR = as.numeric(sub(",", ".", SJR))) %>% 
  select(Title, full_title, SJR, H_index, Categories, Publisher)

The table of OA journals covered under the CAUL agreement was downloaded from Airtable and saved as CAUL_agreement_all_publishers_2024.csv file (see 1)

Downloading the complete table of OA journals under the CAUL agreement from Airtable

Figure 1: Downloading the complete table of OA journals under the CAUL agreement from Airtable

The table was imported to R and processed in a similar way to the IF data, to try and achieve consistent journal names that could be joined.

free_journals <- read_csv("data/CAUL_agreement_all_publishers_2024.csv", col_types = "cccccc") %>% 
  clean_names("mixed", abbreviations = c("CAUL", "OA", "ISSN")) %>% 
  mutate(full_title=gsub("Thalassas:.+", "Thalassas", Journal_Name),
         full_title=map_chr(full_title, .f = sanitize_title))

Then, all 3 tables were joined.

complete_IF_table <- JCR_IF_table %>% 
  select(full_title, journal_IF) %>%
  bind_rows(my_IF_table) %>%
  distinct() %>% 
  full_join(SJR_table) %>% 
  full_join(free_journals %>% select(-Journal_Name),by = "full_title") %>% 
  distinct() %>% 
  mutate(Publisher=ifelse(!is.na(Publisher) & !is.na(Publisher_Name), glue::glue("{Publisher}/{Publisher_Name}"), 
                          ifelse(!is.na(Publisher) & is.na(Publisher_Name),Publisher, Publisher_Name)),
    Included_in_CAUL_OA_agreement=replace_na(Included_in_CAUL_OA_agreement, "No")) %>% 
  select(!contains("Publisher_")) 

The resulting table can be presented in a Shiny app or R Markdown file (like this one) with the DT or reactable packages.

complete_IF_table %>% 
  select(Journal=full_title, impact_factor_2023=journal_IF, SJR, H_index, 
                             Categories, Included_in_CAUL_OA_agreement, 
                             Publisher) %>% # Doi 
  clean_names("title", abbreviations = c("ISSN", "DOI", "IF", "CAUL","AO", "SJR")) %>% 
  reactable(columns = list(
    Journal = colDef(width = 275),   # 50% width, 200px minimum
    Categories = colDef(width = 275),
    SJR = colDef(width = 70),
    "H Index" = colDef(width = 70),
    Publisher = colDef(width = 120)
  ),
  filterable = TRUE, searchable = TRUE, resizable = TRUE, bordered = TRUE, striped = TRUE, highlight = TRUE)