Package 'auscensus'

Title: Access Australian Census Data (2006-2021)
Description: R package to interact with Australian Census Data Packs, providing an interface to extract data across multiple censuses.
Authors: Carlos Yáñez Santibáñez [aut, cre], Craig Alexander [ths], Kyle Walker [cph], Australian Bureau of Statistics [cph]
Maintainer: Carlos Yáñez Santibáñez <[email protected]>
License: MIT + file LICENSE
Version: 0.0.1.0011
Built: 2024-11-13 06:16:08 UTC
Source: https://github.com/carlosyanez/auscensus

Help Index


Helper function to convert attributes to list

Description

Small helper function that converts a tibble into a list of vectors, which is the format expected for the attribute input of get_census_summary().

Usage

attribute_tibble_to_list(df, original = colnames(df)[1], new = colnames(df)[2])

Arguments

df

tibble/data.frame. The first column holds the original value, the second the new label.

original

name of the column with the original attribute (defaults to the first column)

new

name of the column with the new label (defaults to the second column)

Value

list object

Examples

## Not run: 
library(tibble)
attributes <- tribble(~Census_stat, ~Group,
         "Age_years_60_males", "60 year old male",
         "Age (Years): 60_males", "60 year old male",
         "Age_years_60_females", "60 year old female",
         "Age (Years): 60_females", "60 year old female")
attribute_tibble_to_list(attributes)


## End(Not run)
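
The conversion described above can be illustrated with base R alone. This is a minimal sketch of the idea (group the original names by their new label), not the package's implementation; the data frame below simply mirrors the example and split() stands in for whatever the function does internally.

```r
# Hypothetical base-R sketch; not the package's implementation.
attributes <- data.frame(
  Census_stat = c("Age_years_60_males", "Age (Years): 60_males",
                  "Age_years_60_females", "Age (Years): 60_females"),
  Group = c("60 year old male", "60 year old male",
            "60 year old female", "60 year old female"),
  stringsAsFactors = FALSE
)

# Group the original names (first column) by their new label (second column).
attribute_list <- split(attributes[[1]], attributes[[2]])

# Show the resulting named list of character vectors.
str(attribute_list)
```

Each list element is named after a label and holds every original attribute name mapped to it, which is the shape the attribute argument of get_census_summary() is documented to take.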

Helper function to calculate percentages

Description

Helper function that calculates percentages from the values in value_col, relative to the row whose key_col matches key_value (the total).

Usage

calculate_percentage(
  df,
  key_col,
  value_col,
  key_value = "Total",
  percentage_scale = 1
)

Arguments

df

data frame

key_col

name of the column containing the total label (see key_value)

value_col

name of the column containing values

key_value

label that identifies the total row in key_col (defaults to "Total")

percentage_scale

1 if percentages are to be presented on a 0-1 scale, or 100 to show them as 0%-100%

Value

list object
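
The arithmetic implied by the arguments above can be sketched in base R. This is only an illustration under assumptions: the data frame, its column names and its figures are made up, and the real function's output shape (for example, whether it keeps the total row) is not documented here.

```r
# Hypothetical sketch of the percentage calculation; all values are made up.
df <- data.frame(
  Attribute = c("Males", "Females", "Total"),
  Value = c(40, 60, 100),
  stringsAsFactors = FALSE
)

key_col <- "Attribute"     # column holding the total label
value_col <- "Value"       # column holding the values
key_value <- "Total"       # label identifying the total row
percentage_scale <- 100    # 1 for a 0-1 scale, 100 for 0%-100%

# Find the total, then express every value relative to it.
total <- df[[value_col]][df[[key_col]] == key_value]
df$Percentage <- df[[value_col]] / total * percentage_scale

print(df)
```

With percentage_scale set to 1, the same code would yield proportions between 0 and 1 instead.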


Helper function to download data

Description

Helper function to download data

Usage

census_datapacks()

Value

nothing


Helper function to delete cached data files

Description

Helper function to delete cached data files

Usage

data_census_delete(file = NULL)

Arguments

file

file to delete; defaults to all of them (provide the full path, which can be obtained from data_census_info())

Value

nothing


Helper function to download data

Description

Helper function to download data

Usage

data_census_download(
  download_dir,
  census_year = NULL,
  download_method = "wget"
)

Arguments

download_dir

Full path where to download census files (required)

census_year

census year to download (defaults to all)

download_method

method to pass to download.file() ("wget" as default)

Value

nothing


Helper function to import a file into the cache

Description

Helper function to import a file into the cache

Usage

data_census_import(file)

Arguments

file

file to import to the cache

Value

nothing


Helper function to update/download data

Description

Helper function to update/download data

Usage

data_census_info()

Value

nothing


Helper function to find cache folder

Description

Helper function to find cache folder

Usage

find_census_cache()

Value

nothing


Get census data.

Description

This function extracts table files from each data pack (given tables and geo structure) and collates them into a list(), which it returns. By default, it saves the processed tables in the cache folder (as parquet files), which it uses on subsequent calls.

Usage

get_census_data(
  census_table,
  geo_structure,
  selected_years = list_census_years(),
  ignore_cache = FALSE,
  collect_data = FALSE,
  attr = NULL
)

Arguments

census_table

list of tables, in the format of the output of list_census_tables()

geo_structure

character vector of geo structures (e.g. "SA1", "LGA", "CED")

selected_years

years to filter

ignore_cache

If TRUE, it will ignore cached files

collect_data

If TRUE, it will return the data. If FALSE (default), it will return arrow bindings to the cached files

attr

attributes to filter on, presented as a character vector (e.g. c("Age_years_60_males","Age_years_60_females"))

Value

list with the collated tables (data if collect_data is TRUE, otherwise arrow bindings to the cached files)

Examples

## Not run: 
data <- get_census_data(census_table = list_census_tables("04"),
                        geo_structure = "LGA")

names(data)

## End(Not run)

Get a summary of one or a collection of statistics across Censuses

Description

This function produces a summary of one or many statistics across censuses. Results are presented in a simple summary table. The function can present individual statistics or an aggregation of several statistics (e.g. aggregate the number of births by country to present a continental total). If the name of the statistic containing totals is provided, the function can also calculate percentages (presented on either a 0-1 or a 0-100 scale).

Usage

get_census_summary(
  table_number = NULL,
  geo_structure = NULL,
  attribute,
  geo_unit_names = NULL,
  geo_unit_codes = NULL,
  selected_years = list_census_years(),
  reference_total = NULL,
  percentage_scale = 1,
  ignore_cache = FALSE,
  data_source = NULL,
  data_collected = FALSE,
  census_table = NULL
)

Arguments

table_number

number of selected table

geo_structure

string with the geographical structure for which to present statistics (e.g. SA1, LGA, CED)

attribute

list with vectors of statistics to be summarised. The elements of each vector will be aggregated and presented under the item's name, e.g. list("60 year old male"=c("Age_years_60_males","Age (Years): 60_males"))

geo_unit_names

vector with names of the geographic units to present. They need to correspond with geo_structure, e.g. if geo_structure="LGA", acceptable values could be c("Melbourne","Stonnington","Yarra"). If both this and geo_unit_codes are NULL, it will present all available elements.

geo_unit_codes

vector with ABS codes of the geographic units to present. Similar to geo_unit_names.

selected_years

vector with selected years to display.

reference_total

Optional. List containing the names of all statistics representing totals, e.g. list("Total"=c("Total_persons"))

percentage_scale

1 if percentages are to be presented on a 0-1 scale, or 100 to show them as 0%-100%

ignore_cache

If TRUE, it will ignore cached files

data_source

result of get_census_data (will ignore other parameters if this is provided)

data_collected

TRUE if data_source is a collected dataset, FALSE if it is a database or arrow binding

census_table

Instead of using a table number, this allows for a more complex filter table, e.g. containing different table numbers. Expected format matches the output of list_census_tables().

Value

data frame with the summary table

Examples

## Not run: 
get_census_summary(table_number = "04",
    attribute = list("60 year old male"=c("Age_years_60_males","Age (Years): 60_males"),
                      "60 year old female"=c("Age_years_60_females","Age (Years): 60_females")),
    geo_unit_names = c("Melbourne","Stonnington","Yarra"),
    reference_total = list("Total"=c("Total_persons")))


## End(Not run)
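
The aggregation and percentage logic described for this function can be sketched with base R on a toy table. Everything below is an assumption made for illustration: the long-format layout, the column names (Year, Attribute, Value) and the figures are invented, and the package's actual implementation may differ. The sketch assumes that each named group collects alternative spellings of the same statistic, as the example above suggests.

```r
# Hypothetical toy data; layout, column names and figures are made up.
raw <- data.frame(
  Year = c(2016, 2016, 2021, 2021),
  Attribute = c("Age_years_60_males", "Total_persons",
                "Age (Years): 60_males", "Total_persons"),
  Value = c(50, 1000, 66, 1100),
  stringsAsFactors = FALSE
)

attribute <- list("60 year old male" = c("Age_years_60_males",
                                         "Age (Years): 60_males"))
reference_total <- list("Total" = c("Total_persons"))

# Build a lookup from raw attribute names to their group label.
lookup <- stack(c(attribute, reference_total))
raw$Label <- as.character(lookup$ind[match(raw$Attribute, lookup$values)])

# Aggregate values under each label, per census year.
summary_tbl <- aggregate(Value ~ Year + Label, data = raw, FUN = sum)

# Express the statistic as a percentage of the reference total (0-100 scale).
totals <- summary_tbl[summary_tbl$Label == "Total", ]
males <- summary_tbl[summary_tbl$Label == "60 year old male", ]
males$Percentage <- males$Value / totals$Value[match(males$Year, totals$Year)] * 100

print(males)
```

Grouping by label first means a statistic can be followed across years even if its raw name differs between them.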

Get names of attributes for given census tables, across all years

Description

Get list of available attributes, filterable by table number and attribute name.

Usage

list_census_attributes(number = NULL, attribute_regex = NULL)

Arguments

number

vector containing one or more table numbers

attribute_regex

string with a regular expression to filter attribute names

Value

tibble, showing the attributes available for each year

Examples

## Not run: 
# Get list of all attributes
list_census_attributes()
 
## End(Not run)

Get census geographies, filterable

Description

Get list of available geographies, filterable by type and name.

Usage

list_census_geo(geo_types = NULL, geo_names = NULL, geo_name_regex = NULL)

Arguments

geo_types

vector containing one or more geography types (e.g. "STE","CED","SA1"). NULL by default.

geo_names

vector containing one or more geography names (e.g. "Melbourne","Yarra","Stonnington" for LGAs). NULL by default.

geo_name_regex

string with a regular expression to filter geography names (e.g. for all elements starting with M: "^M")

Value

tibble, showing the geo type, available for each year

Examples

## Not run: 
# Get list of all Commonwealth electoral divisions and Local Government Areas that start with "Mel"
list_census_geo(geo_types=c("CED","LGA"),geo_name_regex="^Mel")
 
## End(Not run)
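
For readers less familiar with regular expressions, the sketch below shows how anchors change which names a pattern such as the one passed to geo_name_regex would match. It uses base R's grepl() on a made-up vector of names; it is assumed, not confirmed, that the package applies the pattern in this standard way.

```r
# Made-up geography names, for illustration only.
geo_names <- c("Melbourne", "Melton", "Yarra", "Port Melbourne")

# "^" anchors the pattern at the start of the name.
starts_with_mel <- geo_names[grepl("^Mel", geo_names)]

# Without an anchor, the pattern matches anywhere in the name.
contains_mel <- geo_names[grepl("Mel", geo_names)]

# "$" anchors the pattern at the end of the name.
ends_with_a <- geo_names[grepl("a$", geo_names)]

print(starts_with_mel)
print(contains_mel)
print(ends_with_a)
```

Note that "$" marks the end of a string, so a pattern meant to match the first letters of a name needs "^" instead.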

List whether a geo structure is available for a particular table in a particular year

Description

List whether a geo structure is available for a particular table in a particular year

Usage

list_census_geo_tables(year = NULL, geo = NULL, table_number = NULL)

Arguments

year

vector with years

geo

vector with geo structure

table_number

table number

Value

tibble

Examples

## Not run: 
# Get list of all available combinations of year, geo structure and table
list_census_geo_tables()
 
## End(Not run)

Get Geography types.

Description

Very simple function listing the geography types (e.g. SAx, CED, etc.) for which a data pack has been imported.

Usage

list_census_geo_types()

Value

tibble, showing the geotype, available for each year

Examples

## Not run: 
# Get list of all geography types
list_census_geo_types()
 
## End(Not run)

Get census tables, filterable

Description

Get list of available tables, filterable by number and name.

Usage

list_census_tables(number = NULL, table_name_regex = NULL)

Arguments

number

vector containing one or more table numbers

table_name_regex

string with a regular expression to filter table names (e.g. for all elements containing Country: "Country")

Value

tibble, showing the tables available for each year

Examples

## Not run: 
# Get list of all tables
list_census_tables()
 
## End(Not run)

Get Census years.

Description

Very simple function listing the Census years included in this package, or those for which a data pack has been imported.

Usage

list_census_years(mode = "available")

Arguments

mode

Either "listed" or "available"

Value

vector with years

Examples

## Not run: 
# Get list of all available Census years
list_census_years()
 
## End(Not run)

Helper function to delete all csv in cache

Description

Helper function to delete all csv in cache

Usage

remove_census_cache_csv()

Value

nothing