| Title: | Access Federal, State, and Local Election Data |
|---|---|
| Description: | Provides an 'R' interface for downloading and standardizing election data to support research workflows. Election results are published by states through heterogeneous and often dynamic web interfaces that are not consistently accessible through existing 'R' packages or APIs. To address this, the package wraps state-specific 'Python' web scrapers through the 'reticulate' package, enabling access to dynamic content while exposing consistent 'R' functions for querying election availability and results across jurisdictions. The package is intended for responsible use and relies on publicly accessible election result pages. |
| Authors: | Graham Chickering [aut, cre], Chris Warshaw [ctb] |
| Maintainer: | Graham Chickering <[email protected]> |
| License: | Apache License (>= 2.0) |
| Version: | 0.1.0 |
| Built: | 2026-06-07 10:48:35 UTC |
| Source: | https://github.com/gchickering21/downballotr |
Returns a data frame listing the earliest available year for each state and scraper source tracked by DownBallotR. All sources include data through the current calendar year.
db_available_years(state = NULL)
state |
Optional state name to filter results (e.g. |
A data.frame with columns source, state,
start_year, and end_year.
# All sources
db_available_years()

# Filter to one state
db_available_years(state = "Virginia")
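As a rough illustration of how the returned table might be used, the sketch below checks which states cover a given election year. It does not call the package: `avail` is a hand-built stand-in with the documented columns, the source labels are hypothetical placeholders, the start years are copied from the routing rules documented for scrape_elections(), and `covers_year()` is an illustrative helper rather than a package function.

```r
# Stand-in for the output of db_available_years(); source labels are hypothetical.
current_year <- as.integer(format(Sys.Date(), "%Y"))
avail <- data.frame(
  source     = c("nc_example", "ct_example", "ga_example", "ut_example"),
  state      = c("North Carolina", "Connecticut", "Georgia", "Utah"),
  start_year = c(2000L, 2016L, 2000L, 2023L),
  end_year   = current_year,  # all sources run through the current year
  stringsAsFactors = FALSE
)

# Hypothetical helper: states whose coverage window contains `year`.
covers_year <- function(avail, year) {
  avail$state[avail$start_year <= year & avail$end_year >= year]
}

covers_year(avail, 2018)
```

With a real session, the same filter could presumably be applied directly to the data frame returned by db_available_years().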
List all registered Python scraper sources
db_list_sources()
Character vector of source names.
List states supported by DownBallotR scrapers
db_list_states(source = NULL)
source |
One of the sources returned by |
Named character vector of canonical state names. When
source = NULL each element is named by its source; when a single
source is given the names are omitted.
Creates/uses a named virtual environment and installs Python requirements (pandas, requests, lxml, bs4, playwright), then installs Playwright Chromium.
downballot_install_python(
  envname = "downballotR",
  python = NULL,
  reinstall = FALSE,
  install_chromium = TRUE,
  quiet = FALSE
)
envname |
Name of the virtualenv to create/use. |
python |
Path to a python executable to use when creating the env (optional). |
reinstall |
If TRUE, reinstall packages even if already installed. |
install_chromium |
If TRUE, install Playwright Chromium browser.
In interactive sessions, the user will be prompted for explicit consent
before the download (~100-200 MB) begins. In non-interactive sessions,
the function will error if Chromium is missing; set
|
quiet |
If TRUE, suppress progress messages. |
Python must already be installed on your system before calling this
function. downballot_install_python() creates a virtual environment
using an existing Python interpreter — it does not install Python itself.
If Python is not found, reticulate will error with a message about
being unable to create a virtualenv.
Windows: Install Python from https://www.python.org/downloads/. Make sure to check "Add Python to PATH" during installation.
macOS: Python 3 is available via Xcode Command Line Tools
(xcode-select --install) or https://www.python.org/downloads/.
Linux: Install via your package manager, e.g.
sudo apt install python3 python3-venv (Debian/Ubuntu) or
sudo dnf install python3 (Fedora/RHEL).
If the environment already exists and all required packages are present,
the function prints a message and returns without doing work (unless Chromium
is missing and install_chromium = TRUE). In all cases, it attempts to
initialize reticulate to the selected interpreter for this session.
Called for side effects. Returns invisible(TRUE) on success,
or invisible(FALSE) if the user declines the Chromium download.
Reports whether the Python virtual environment exists, whether reticulate is initialized (and which Python is active), which required packages are missing, and whether Playwright Chromium is available.
downballot_python_status(
  envname = "downballotR",
  required_pkgs = db_required_python_packages(),
  quiet = FALSE
)
envname |
Name of the virtualenv to check. |
required_pkgs |
Character vector of required Python packages. Defaults
to |
quiet |
If |
This function does not modify the environment.
An object of class downballot_python_status, returned invisibly when
quiet = FALSE.
Pins reticulate to the package's virtualenv for the current R session. If reticulate is already initialized to a different interpreter, this errors with a clear message (reticulate cannot switch interpreters mid-session).
downballot_use_python(envname = "downballotR")
envname |
Name of the virtualenv to use. |
Invisibly TRUE on success.
Print a downballot_python_status object
## S3 method for class 'downballot_python_status'
print(x, ...)
x |
A |
... |
Further arguments passed to or from other methods (unused). |
Invisibly returns x, the downballot_python_status
object passed in, following the S3 print method convention.
A single entry point that automatically routes to the appropriate scraper
based on state. Use db_list_states("election_stats")
to see states supported by the general-election scraper.
scrape_elections(
  state = NULL,
  year_from = NULL,
  year_to = NULL,
  level = c("all", "state", "county", "precinct", "town", "parish"),
  parallel = TRUE,
  max_workers = 4L,
  include_vote_methods = FALSE
)
state |
State name or 2-letter abbreviation, accepted in any case or
spacing style (e.g. |
year_from |
Start year, inclusive (default |
year_to |
End year, inclusive (default |
level |
Constituency (geographic reporting) level of the returned
results — i.e., the spatial unit at which votes are tabulated. Each value
corresponds to a constituency:
|
parallel |
( |
max_workers |
(Georgia / Utah / Connecticut / Louisiana) Maximum number
of parallel Chromium browsers (default |
include_vote_methods |
(Georgia only) If |
Routing rules (applied in order):
state matches North Carolina (e.g. "NC", "north_carolina") → NC State Board of Elections scraper (2000–present).
state matches Connecticut (e.g. "CT", "connecticut") → Connecticut CTEMS scraper (2016–present).
state matches Georgia (e.g. "GA", "georgia") → Georgia Secretary of State scraper (2000–present).
state matches Utah (e.g. "UT", "utah") → Utah election results scraper (2023–present).
state matches Indiana (e.g. "IN", "indiana") → Indiana General Election results scraper (2019–present).
state matches Louisiana (e.g. "LA", "louisiana") → Louisiana Secretary of State scraper (1982–present).
All other states → ElectionStats multi-state scraper.
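The routing described above can be pictured with a small base-R sketch. This is an illustration under stated assumptions, not the package's implementation: `normalize_state()`, `route_state()`, and both lookup vectors are hypothetical names, and the normalization (lower-casing, collapsing underscores and whitespace) is only one plausible reading of "any case or spacing style".

```r
# Hypothetical normalizer: lower-case, trim, and collapse "_" or runs of spaces.
normalize_state <- function(x) {
  x <- tolower(trimws(x))
  gsub("[_[:space:]]+", " ", x)
}

# Illustrative lookups built from the documented routing rules.
abbreviations <- c(
  "nc" = "north carolina", "ct" = "connecticut", "ga" = "georgia",
  "ut" = "utah", "in" = "indiana", "la" = "louisiana"
)
scrapers <- c(
  "north carolina" = "NC State Board of Elections",
  "connecticut"    = "Connecticut CTEMS",
  "georgia"        = "Georgia Secretary of State",
  "utah"           = "Utah election results",
  "indiana"        = "Indiana General Election results",
  "louisiana"      = "Louisiana Secretary of State"
)

# Hypothetical router: dedicated scraper if one matches, otherwise ElectionStats.
route_state <- function(state) {
  key <- normalize_state(state)
  if (key %in% names(abbreviations)) key <- abbreviations[[key]]
  if (key %in% names(scrapers)) scrapers[[key]] else "ElectionStats multi-state"
}

route_state("north_carolina")
route_state("virginia")
```

Because the six dedicated states are mutually exclusive, a lookup table gives the same result as checking the rules one by one in order.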
A data.frame, or a named list when level = "all":
$state + $county (+ $precinct when available) for ElectionStats;
$state + $county + $precinct for Georgia / Utah (or just $state / $county / $precinct alone when the corresponding level is specified);
$state + $county for Indiana;
$state + $town for Connecticut;
$state + $parish for Louisiana;
$precinct + $county + $state for North Carolina.
Each component is also assigned directly into the calling environment
(e.g. ga_state, ga_county) when level = "all".
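Since the return value is either a single data frame or a named list of data frames, downstream code may want to treat both shapes uniformly. The sketch below shows one way to do that. `as_level_list()` is a hypothetical helper, not part of the package, and the `candidate` and `votes` columns in the stand-in data are assumed purely for illustration.

```r
# Hypothetical helper: always hand back a named list of data frames.
as_level_list <- function(res, level = "state") {
  # Check is.data.frame() first, because a data frame is also a list.
  if (is.data.frame(res)) {
    stats::setNames(list(res), level)
  } else {
    res
  }
}

# Stand-in objects shaped like the documented return values.
single <- data.frame(candidate = "A", votes = 10L)
multi  <- list(state = single, county = single)

names(as_level_list(single, level = "state"))
names(as_level_list(multi))
```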
# General election results: Virginia
df <- scrape_elections(state = "virginia", year_from = 2023, year_to = 2023, level = "state")

# General election results: Virginia, both state and county levels
res <- scrape_elections(state = "virginia", year_from = 2023, year_to = 2023)
res$state   # candidate-level data frame
res$county  # county vote breakdown data frame

# North Carolina: single year
df <- scrape_elections(state = "NC", year_from = 2024, year_to = 2024)

# Connecticut: statewide + town results for 2024
res <- scrape_elections(state = "CT", year_from = 2024, year_to = 2024)
res$state  # statewide totals
res$town   # town-level results

# Connecticut: statewide only (faster; no town scraping)
df <- scrape_elections(state = "CT", year_from = 2024, year_to = 2024, level = "state")

# Connecticut: setting max_workers explicitly (4L is the default)
res <- scrape_elections(state = "CT", year_from = 2022, year_to = 2022, max_workers = 4L)

# Georgia: statewide + county results
res <- scrape_elections(state = "GA", year_from = 2024, year_to = 2024)

# Georgia: statewide only (faster)
df <- scrape_elections(state = "GA", year_from = 2024, year_to = 2024, level = "state")

# Georgia: with vote-method breakdown
res <- scrape_elections(state = "GA", year_from = 2024, year_to = 2024, include_vote_methods = TRUE)

# Utah: statewide + county results
res <- scrape_elections(state = "UT", year_from = 2024, year_to = 2024)

# Indiana: General Election results (statewide + county)
res <- scrape_elections(state = "IN", year_from = 2024, year_to = 2024)
res$state   # statewide candidate totals
res$county  # county-level breakdown

# Indiana: statewide only (faster)
df <- scrape_elections(state = "IN", year_from = 2022, year_to = 2022, level = "state")

# Louisiana: statewide + parish results
res <- scrape_elections(state = "LA", year_from = 2024, year_to = 2024)
res$state   # statewide candidate totals
res$parish  # parish-level breakdown

# Louisiana: statewide only (faster; skips parish scraping)
df <- scrape_elections(state = "LA", year_from = 2023, year_to = 2023, level = "state")
Computes aggregate statistics for a data frame of election results. The
state is detected automatically from the state column when present,
or from the variable name (e.g. ga_results -> "Georgia").
summarize_results(df, state = NULL)
df |
A data frame returned by |
state |
Optional two-letter state abbreviation or full state name. Overrides auto-detection when supplied. |
A named list (printed on call) with:
state: Detected or supplied state name.
years: Integer vector of election years present.
n_years: Number of distinct election years.
n_elections: Number of distinct elections.
n_candidates: Number of distinct candidate names.
office_level_breakdown: Named integer vector of distinct elections by office level (Federal / State / Local). Note: this is office level, not the constituency level argument used by scrape_elections.
offices_by_level: Named list of distinct office names per office level.
ga_results <- scrape_elections("GA", 2020, 2024)
summarize_results(ga_results)
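To make the summary fields concrete, the following base-R sketch computes comparable quantities on a toy data frame. It is not the package's implementation: `sketch_summary()` is a hypothetical function, the column names (`year`, `election_id`, `candidate`, `office_level`) are assumptions about the result schema, and the rows are made-up placeholders.

```r
# Toy results table; column names and values are assumed for illustration only.
results <- data.frame(
  year         = c(2020L, 2020L, 2022L, 2022L, 2022L),
  election_id  = c("e1", "e1", "e2", "e2", "e3"),
  candidate    = c("A", "B", "A", "C", "D"),
  office_level = c("Federal", "Federal", "State", "State", "Local"),
  stringsAsFactors = FALSE
)

# Hypothetical summary mirroring the documented fields.
sketch_summary <- function(df) {
  # One row per election, so each election is counted once per office level.
  elections <- unique(df[, c("election_id", "office_level")])
  list(
    years        = sort(unique(df$year)),
    n_years      = length(unique(df$year)),
    n_elections  = nrow(elections),
    n_candidates = length(unique(df$candidate)),
    office_level_breakdown = vapply(
      split(elections$election_id, elections$office_level),
      length, integer(1)
    )
  )
}

s <- sketch_summary(results)
s$office_level_breakdown
```

Deduplicating to one row per election before counting is what keeps the office-level breakdown from being inflated by the number of candidates in each contest.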