working on explanation

2026-06-02 12:25:13 +02:00 · 5a991f1e0e
commit 5a991f1e0e
parent 5ff3d3f31f
1 changed files with 11 additions and 5 deletions
--- a/data/processed/proportions_CA_table.md
+++ b/data/processed/proportions_CA_table.md
@@ -21,9 +21,15 @@ Every other column is a variable used in the CA. Variables are grouped into eigh
 | `migration`  | 2             | Inmigrations, outmigrations.                                                                    |
 | `demography` | 2             | Number of retirees, number of localities. Pooled into one block so block normalisation can run. |
 #### A note on the demography block
 Pooling *retirees* and *localities* is partly a technical workaround. Block normalisation needs at least two variables in a block, otherwise the rescaling collapses every row to the same constant and the variable carries no information.
 Both variables describe **how the population is distributed**: retirees say something about *who lives there* (ageing concentration), localities say something about *where they live* (how many separate settlements the municipality has). The contrast the block ends up encoding—retirees relative to localities—is in effect "people-per-settlement vs spread-thin-across-settlements". A municipality with many retirees per locality reads as an ageing population concentrated in a few settlements; one with many localities per retiree reads as a population spread thinly across small ones. That contrast lines up with the urban–rural gradient the analysis is built to detect.
 ### Supplementary blocks (2)
-| Block         | Variables (n) | Content                                                                                                                                                                                       |
+| Block         | Variables (n) | What it captures                                                                                                                                                                              |
 | ------------- | ------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | `provision` | up to 33      | Counts of educational institution units by type (preschool, primary, secondary, adult, HE) × {total, public, private}. Some columns with no observations anywhere are dropped automatically. |
 | `opinion`   | 9             | Survey-based satisfaction with preschool, elementary school and high school (bad / mid / good shares).                                                                                        |
@@ -34,7 +40,7 @@ The pipeline lives in `src/municipalities/04-proportions_CA.R`. The table you se
 ### 1. Backfilled 2022 cross-section
-For every variable, the value used is the 2022 figure if available; otherwise the most recent earlier figure for that municipality (priority: 2020-2021 window, then census-closest, then discontinued-last). This mirrors the sampling logic of `01-sampling.R`.
+For every variable, the value used is the 2022 figure if available; otherwise the most recent earlier figure for that municipality (priority: 2020-2021 window, then census-closest, then discontinued-last).
 ### 2. Block normalisation
@@ -43,7 +49,7 @@ Inside each *active* block (and the opinion block), every municipality's row is
 - every municipality contributes the same row mass to the CA (no size effect), and
 - every block contributes the same total weight (no block dominates because its raw counts are bigger).
-So the value in, say, the `Upper secondary` column for Upplands Väsby (435.2) reads as "435.2 of 1000 within the education block for that municipality" — i.e. ~43.5% of the municipality's educated population sits at upper-secondary level.
+So the value in, say, the `Upper secondary` column for Upplands Väsby (435.2) reads as "435.2 of 1000 within the education block for that municipality"; i.e. ~43.5% of the municipality's educated population has attained upper-secondary level.
 ### 3. Provision rescaled as a per-capita rate
@@ -51,8 +57,8 @@ The supplementary `provision` columns are *not* block-normalised. Instead each c
 ### 4. Renaming, drop-empty
-Columns are renamed to short readable labels (see `proportions_CA_table_columns.csv` for the mapping). Any column that is zero in every municipality (e.g. an institution type with no private units anywhere) is dropped — it carries no information and would break the CA's supplementary projection.
+Columns are renamed to short readable labels (see `proportions_CA_table_columns.csv` for the mapping). Any column that is zero in every municipality (e.g. an institution type with no private units anywhere) is dropped.
 ## How to read a row
-A row is a municipality's *profile* across all blocks. Within each active block the values sum to 1000 and can be read as per-mille shares; provision values are rates per 100 000 inhabitants; opinion values are also normalised to a per-1000 share within their preschool/elementary/highschool triples. Across blocks the values aren't comparable as raw numbers — that's the whole point of block normalisation: each block is comparable to *itself* across municipalities, not to the other blocks.
+A row is a municipality's *profile* across all blocks. Within each active block the values sum to 1000 and can be read as per-mille shares; provision values are rates per 100 000 inhabitants; opinion values are also normalised to a per-1000 share within their preschool/elementary/highschool triples. Across blocks the values aren't comparable as raw numbers; each block is comparable to *itself* across municipalities, not to the other blocks.
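
The block normalisation and per-capita rescaling described in the changed file can be illustrated with a small sketch. This is not the code in `04-proportions_CA.R` (which is R and is not shown in this commit). It is a minimal Python illustration, assuming that block normalisation divides each municipality's values by their within-block sum and multiplies by 1000, and that provision rates are counts divided by population and multiplied by 100 000. The function names, the municipalities `A` and `B`, and all numbers are hypothetical.

```python
def normalise_block(values, total=1000.0):
    """Rescale one municipality's values within a block so they sum to `total`.

    Hypothetical helper: assumes normalisation is a plain share of the
    within-block row sum, expressed per mille.
    """
    row_sum = sum(values)
    if row_sum == 0:
        # Nothing to distribute; keep the row at zero rather than divide by zero.
        return [0.0 for _ in values]
    return [total * v / row_sum for v in values]


def per_100k(count, population):
    """Express a raw count as a rate per 100 000 inhabitants (assumed formula)."""
    return 100_000 * count / population


# Made-up three-variable block for two made-up municipalities.
education = {
    "A": [200, 435, 365],
    "B": [50, 30, 20],
}
education_norm = {name: normalise_block(v) for name, v in education.items()}

# A block with a single variable collapses to the same constant for every
# municipality, which is why retirees and localities are pooled into one block.
single_var_block = {
    "A": normalise_block([17]),
    "B": normalise_block([4200]),
}

# Provision is not block-normalised; it is turned into a rate instead.
preschool_rate = per_100k(count=3, population=60_000)

if __name__ == "__main__":
    print(education_norm)
    print(single_var_block)
    print(preschool_rate)
```

After normalisation every row in a block sums to 1000 regardless of municipality size, so values are comparable across municipalities within a block but not across blocks. The single-variable case shows the collapse the demography note refers to: both rows become `[1000.0]`, whatever the raw counts were.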