working on explanation
parent 5ff3d3f31f
commit 5a991f1e0e
1 changed file with 11 additions and 5 deletions
|
|
@@ -21,9 +21,15 @@ Every other column is a variable used in the CA. Variables are grouped into eigh
|
||||||
| `migration` | 2 | Inmigrations, outmigrations. |
|
| `migration` | 2 | Inmigrations, outmigrations. |
|
||||||
| `demography` | 2 | Number of retirees, number of localities. Pooled into one block so block normalisation can run. |
|
| `demography` | 2 | Number of retirees, number of localities. Pooled into one block so block normalisation can run. |
|
||||||
|
|
||||||
|
#### A note on the demography block
|
||||||
|
|
||||||
|
Pooling *retirees* and *localities* is partly a technical workaround. Block normalisation needs at least two variables in a block; with only one, the rescaling collapses every row to the same constant and the variable carries no information.
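
The collapse is easy to see in a minimal sketch. The project's pipeline is written in R; this Python version only illustrates the arithmetic. It assumes block normalisation rescales each municipality's values within a block to sum to 1000, and the function name `normalise_block` and all figures are hypothetical.

```python
def normalise_block(row, total=1000.0):
    """Rescale one municipality's values within a block so they sum to `total`."""
    s = sum(row)
    if s == 0:
        return [0.0 for _ in row]
    return [total * v / s for v in row]

# Hypothetical retiree counts for three municipalities, as a one-variable block.
single = [normalise_block([retirees]) for retirees in (120, 4500, 37000)]
# Every row becomes [1000.0]: the column can no longer tell municipalities apart.

# The same counts pooled with hypothetical locality counts keep a contrast.
pooled = [
    normalise_block([retirees, localities])
    for retirees, localities in ((120, 6), (4500, 9), (37000, 12))
]
```

With two variables the rows still sum to 1000, but the split between them differs from one municipality to the next, which is what the CA can work with.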
|
||||||
|
|
||||||
|
Both variables describe **how the population is distributed**: retirees say something about *who lives there* (ageing concentration), localities say something about *where they live* (how many separate settlements the municipality has). The contrast the block ends up encoding (retirees relative to localities) is in effect "concentrated in few settlements vs spread thin across many". A municipality with many retirees per locality reads as an ageing population concentrated in a few settlements; one with many localities per retiree reads as a population spread thinly across small ones. That contrast lines up with the urban–rural gradient the analysis is built to detect.
|
||||||
|
|
||||||
### Supplementary blocks (2)
|
### Supplementary blocks (2)
|
||||||
|
|
||||||
| Block | Variables (n) | Content |
|
| Block | Variables (n) | What it captures |
|
||||||
| ------------- | ------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
| ------------- | ------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||||
| `provision` | up to 33 | Counts of educational institution units by type (preschool, primary, secondary, adult, HE) × {total, public, private}. Some columns with no observations anywhere are dropped automatically. |
|
| `provision` | up to 33 | Counts of educational institution units by type (preschool, primary, secondary, adult, HE) × {total, public, private}. Some columns with no observations anywhere are dropped automatically. |
|
||||||
| `opinion` | 9 | Survey-based satisfaction with preschool, elementary school and high school (bad / mid / good shares). |
|
| `opinion` | 9 | Survey-based satisfaction with preschool, elementary school and high school (bad / mid / good shares). |
|
||||||
|
|
@@ -34,7 +40,7 @@ The pipeline lives in `src/municipalities/04-proportions_CA.R`. The table you se
|
||||||
|
|
||||||
### 1. Backfilled 2022 cross-section
|
### 1. Backfilled 2022 cross-section
|
||||||
|
|
||||||
For every variable, the value used is the 2022 figure if available; otherwise the most recent earlier figure for that municipality (priority: 2020-2021 window, then census-closest, then discontinued-last). This mirrors the sampling logic of `01-sampling.R`.
|
For every variable, the value used is the 2022 figure if available; otherwise the most recent earlier figure for that municipality (priority: 2020-2021 window, then census-closest, then discontinued-last).
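
As a rough illustration of the basic rule (2022 if available, otherwise the most recent earlier figure), here is a Python sketch. The real pipeline is in R, and the function name `backfill_2022` and the sample series are hypothetical. The priority tiers listed above (2020-2021 window, census-closest, discontinued-last) are not modelled here, since their exact definitions are not given in this text.

```python
def backfill_2022(series, target_year=2022):
    """Pick the value for `target_year`, else the latest earlier non-missing one.

    `series` maps year -> value, with None marking a missing observation.
    Returns a (year, value) pair, or None if no usable observation exists.
    """
    usable = {
        year: value
        for year, value in series.items()
        if value is not None and year <= target_year
    }
    if not usable:
        return None
    year = max(usable)
    return year, usable[year]

# Hypothetical series for one variable in one municipality.
complete = backfill_2022({2020: 10, 2021: 12, 2022: 15})
gap_2022 = backfill_2022({2018: 8, 2021: 12, 2022: None})
no_data = backfill_2022({2022: None})
```

Returning the year alongside the value makes it possible to check afterwards how many cells in the cross-section were actually backfilled.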
|
||||||
|
|
||||||
### 2. Block normalisation
|
### 2. Block normalisation
|
||||||
|
|
||||||
|
|
@@ -43,7 +49,7 @@ Inside each *active* block (and the opinion block), every municipality's row is
|
||||||
- every municipality contributes the same row mass to the CA (no size effect), and
|
- every municipality contributes the same row mass to the CA (no size effect), and
|
||||||
- every block contributes the same total weight (no block dominates because its raw counts are bigger).
|
- every block contributes the same total weight (no block dominates because its raw counts are bigger).
|
||||||
|
|
||||||
So the value in, say, the `Upper secondary` column for Upplands Väsby (435.2) reads as "435.2 of 1000 within the education block for that municipality" — i.e. ~43.5% of the municipality's educated population sits at upper-secondary level.
|
So the value in, say, the `Upper secondary` column for Upplands Väsby (435.2) reads as "435.2 of 1000 within the education block for that municipality"; i.e. ~43.5% of the municipality's educated population has attained upper-secondary level.
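
The per-mille reading can be reproduced with a short Python sketch (the pipeline itself is in R). It assumes each block is rescaled to sum to 1000 per municipality. The function name `block_normalise`, the column names other than `Upper secondary`, `Inmigrations` and `Outmigrations`, and all raw counts are made up for illustration; the education counts were chosen only so that the result matches the 435.2 figure, and are not the real data for Upplands Väsby.

```python
def block_normalise(row, blocks, total=1000.0):
    """Rescale each block of one municipality's row to sum to `total`.

    `row` maps column name -> raw value; `blocks` maps block name -> columns.
    """
    out = {}
    for columns in blocks.values():
        block_sum = sum(row[c] for c in columns)
        for c in columns:
            out[c] = total * row[c] / block_sum if block_sum else 0.0
    return out

blocks = {
    "education": ["Compulsory", "Upper secondary", "Post-secondary"],
    "migration": ["Inmigrations", "Outmigrations"],
}
raw = {
    "Compulsory": 2000, "Upper secondary": 4352, "Post-secondary": 3648,
    "Inmigrations": 300, "Outmigrations": 200,
}
profile = block_normalise(raw, blocks)
row_mass = sum(profile.values())  # 1000 per block, whatever the raw counts were
```

Because every block sums to 1000, the row mass depends only on the number of blocks, not on the size of the municipality or on how large a block's raw counts are.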
|
||||||
|
|
||||||
### 3. Provision rescaled as a per-capita rate
|
### 3. Provision rescaled as a per-capita rate
|
||||||
|
|
||||||
|
|
@@ -51,8 +57,8 @@ The supplementary `provision` columns are *not* block-normalised. Instead each c
|
||||||
|
|
||||||
### 4. Renaming, drop-empty
|
### 4. Renaming, drop-empty
|
||||||
|
|
||||||
Columns are renamed to short readable labels (see `proportions_CA_table_columns.csv` for the mapping). Any column that is zero in every municipality (e.g. an institution type with no private units anywhere) is dropped — it carries no information and would break the CA's supplementary projection.
|
Columns are renamed to short readable labels (see `proportions_CA_table_columns.csv` for the mapping). Any column that is zero in every municipality (e.g. an institution type with no private units anywhere) is dropped.
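
The drop-empty step can be sketched in a few lines of Python (the pipeline is in R; the function name `drop_empty_columns` and the column names and counts below are hypothetical). The sketch assumes missing values have already been handled, so "empty" means exactly zero in every municipality.

```python
def drop_empty_columns(table):
    """Keep only columns with at least one non-zero value.

    `table` maps column name -> list of values, one per municipality.
    """
    return {
        column: values
        for column, values in table.items()
        if any(v != 0 for v in values)
    }

# Hypothetical provision counts for three municipalities.
provision = {
    "Preschool public": [4, 0, 7],
    "Adult private": [0, 0, 0],   # no such units anywhere: dropped
    "Primary private": [0, 1, 0],
}
kept = drop_empty_columns(provision)
```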
|
||||||
|
|
||||||
## How to read a row
|
## How to read a row
|
||||||
|
|
||||||
A row is a municipality's *profile* across all blocks. Within each active block the values sum to 1000 and can be read as per-mille shares; provision values are rates per 100 000 inhabitants; opinion values are also normalised to a per-1000 share within their preschool/elementary/highschool triples. Across blocks the values aren't comparable as raw numbers — that's the whole point of block normalisation: each block is comparable to *itself* across municipalities, not to the other blocks.
|
A row is a municipality's *profile* across all blocks. Within each active block the values sum to 1000 and can be read as per-mille shares; provision values are rates per 100 000 inhabitants; opinion values are also normalised to a per-1000 share within their preschool/elementary/highschool triples. Across blocks the values aren't comparable as raw numbers; each block is comparable to *itself* across municipalities, not to the other blocks.
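
For the provision columns, the "per 100 000 inhabitants" reading amounts to the following arithmetic. This is a Python sketch with a hypothetical function name and made-up figures, and it assumes the denominator is the municipality's total population; the R pipeline may define it differently.

```python
def per_100k(count, population):
    """Express a raw count as a rate per 100 000 inhabitants."""
    return 100_000 * count / population

# Hypothetical municipality: 12 preschool units, 48 000 inhabitants.
preschool_rate = per_100k(12, 48_000)
```

Unlike the block-normalised columns, a rate like this does not sum to anything across the provision block, so it should be compared between municipalities column by column rather than as a share.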
|
||||||
|
|
|
||||||