Current .csv file - v0.1

Changelog

  • v0.1
    • First public release

Citation

Sonnet, Luke. 2019. “2018 Pakistani General Election Polling Station Data.” https://doi.org/10.17605/osf.io/mtsnd.

Automatically generated codebook

Metadata

Description

Dataset name: 2018 Pakistani General Elections Polling Station Level Data (long)

Overview

These data come from the official form 48s (polling station level returns by candidate) and are a reshaped version of the wide polling station data. They are in long format, so each row is one polling station-candidate pair. For more thorough documentation of the data, please see the wide data page.
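As a rough illustration of what "made long" means here, the sketch below turns one-row-per-polling-station records into one row per polling station-candidate pair. The wide column names (`votes_1`, `votes_2`) and the identifier values are hypothetical; the real wide file is documented on the wide data page and may be laid out differently.

```python
def wide_to_long(wide_rows, n_candidates):
    """Return one record per polling station-candidate pair."""
    long_rows = []
    for row in wide_rows:
        for candidate_id in range(1, n_candidates + 1):
            long_rows.append({
                "constituency_ps_id": row["constituency_ps_id"],
                "candidate_id": candidate_id,
                # "votes_<k>" is a hypothetical wide column name, not the real one.
                "candidate_votes": row.get(f"votes_{candidate_id}"),
            })
    return long_rows


# Made-up wide records: two polling stations, two candidates each.
wide = [
    {"constituency_ps_id": "X1-1", "votes_1": 120, "votes_2": 80},
    {"constituency_ps_id": "X1-2", "votes_1": 95, "votes_2": None},
]
long_rows = wide_to_long(wide, n_candidates=2)
for record in long_rows:
    print(record)
```

Two stations with two candidates each give four long rows; a missing count in the wide record stays missing in the long one.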

Variable summary

Identifiers:

  • constituency_id
  • constituency_ps_id: constituency_id and ps_id pasted together; a unique identifier of a polling station within its constituency
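Because each row is a polling station-candidate pair, the combination of constituency_ps_id and candidate_id should identify a row. The sketch below is one way to check that after loading the file; the identifier values are made up, and the check itself is a suggestion rather than part of the original processing.

```python
from collections import Counter


def duplicate_keys(rows):
    """Return (constituency_ps_id, candidate_id) pairs that occur more than once."""
    counts = Counter((r["constituency_ps_id"], r["candidate_id"]) for r in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# Made-up rows; the last two share the same key on purpose.
rows = [
    {"constituency_ps_id": "X1-1", "candidate_id": 1},
    {"constituency_ps_id": "X1-1", "candidate_id": 2},
    {"constituency_ps_id": "X1-2", "candidate_id": 1},
    {"constituency_ps_id": "X1-2", "candidate_id": 1},
]
duplicates = duplicate_keys(rows)
print(duplicates)
```

An empty result would mean every polling station-candidate pair appears exactly once.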

Polling station level data:

  • ps_total_votes: reported total number of votes (valid plus invalid) on form 48
  • ps_invalid_votes: reported number of invalid votes on form 48
  • ps_valid_votes: reported number of valid votes on form 48
  • ps_valid_votes_summed: the summed number of candidate_votes in the polling station. This may differ from ps_valid_votes if there are errors in any of the candidate totals or in the ps_valid_votes field
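The relationship between ps_valid_votes_summed and ps_valid_votes can be sketched as follows: sum candidate_votes within each polling station and list the stations where that sum disagrees with the reported figure. Treating the sum as missing whenever any candidate count is missing is an assumption of this sketch, not a documented rule, and the station identifiers are invented.

```python
def station_sums(rows):
    """Sum candidate_votes per polling station; None if any count is missing."""
    sums = {}
    for r in rows:
        ps = r["constituency_ps_id"]
        votes = r["candidate_votes"]
        if ps not in sums:
            sums[ps] = 0
        if votes is None or sums[ps] is None:
            sums[ps] = None  # assumption: one missing count makes the sum missing
        else:
            sums[ps] += votes
    return sums


def mismatched_stations(rows):
    """Stations whose summed candidate votes differ from the reported ps_valid_votes."""
    sums = station_sums(rows)
    reported = {r["constituency_ps_id"]: r["ps_valid_votes"] for r in rows}
    return sorted(
        ps for ps, total in sums.items()
        if total is not None and reported[ps] is not None and total != reported[ps]
    )


# Made-up rows for three stations.
rows = [
    {"constituency_ps_id": "S1", "candidate_votes": 10, "ps_valid_votes": 30},
    {"constituency_ps_id": "S1", "candidate_votes": 20, "ps_valid_votes": 30},
    {"constituency_ps_id": "S2", "candidate_votes": 5, "ps_valid_votes": 15},
    {"constituency_ps_id": "S2", "candidate_votes": 7, "ps_valid_votes": 15},
    {"constituency_ps_id": "S3", "candidate_votes": 4, "ps_valid_votes": 9},
    {"constituency_ps_id": "S3", "candidate_votes": None, "ps_valid_votes": 9},
]
sums = station_sums(rows)
mismatches = mismatched_stations(rows)
print(sums)
print(mismatches)
```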

Candidate level data:

  • candidate_id: id number, unique to the constituency, that matches the column number in the wide data
  • candidate_name
  • candidate_party
  • candidate_votes: the number of valid votes cast for this candidate at this polling station
  • candidate_valid_share: the candidate's vote share of the reported total number of polling station valid votes (candidate_votes / ps_valid_votes)
  • candidate_valid_share_of_summed: the candidate's vote share of the summed candidate-level valid votes (candidate_votes / ps_valid_votes_summed)
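Both share variables are plain ratios, so they can be recomputed from the count columns. The helper below is a minimal sketch of that ratio; returning a missing value when either input is missing or the denominator is zero is an assumption here, since this codebook does not state how those cases were handled.

```python
def share(votes, denominator):
    """Return votes / denominator, or None when the ratio is undefined."""
    if votes is None or denominator is None or denominator == 0:
        return None  # assumption: undefined ratios are treated as missing
    return votes / denominator


# candidate_valid_share uses the reported ps_valid_votes as the denominator;
# candidate_valid_share_of_summed uses ps_valid_votes_summed instead.
print(share(30, 120))    # reported denominator
print(share(30, 125))    # summed denominator
print(share(30, 0))      # undefined
```

The two shares for the same row differ only when ps_valid_votes and ps_valid_votes_summed disagree.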

Other data:

  • candidate_total_valid_votes_summed: the constituency-level sum of candidate_votes for a candidate
  • candidate_total_valid_votes_polled_49: the constituency-level total votes polled by this candidate as reported on the constituency-level form 49s. This may differ from candidate_total_valid_votes_summed if some polling stations are missing or there are errors in any of the polling station level candidate vote totals.
  • n_candidates: the number of candidates competing in the constituency
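The comparison between the summed polling station totals and the form 49 totals can be sketched as below: add up candidate_votes for each candidate within a constituency and report the gap to candidate_total_valid_votes_polled_49. The identifiers and counts are invented, and propagating a missing station count to the whole total is an assumption of this sketch.

```python
def candidate_totals(rows):
    """Sum candidate_votes per (constituency_id, candidate_id); None if any count is missing."""
    totals = {}
    for r in rows:
        key = (r["constituency_id"], r["candidate_id"])
        votes = r["candidate_votes"]
        current = totals.get(key, 0)
        # assumption: one missing station count makes the total missing
        totals[key] = None if votes is None or current is None else current + votes
    return totals


def form49_gaps(rows):
    """Return form 49 total minus summed total, for candidates where the two differ."""
    totals = candidate_totals(rows)
    reported = {
        (r["constituency_id"], r["candidate_id"]): r["candidate_total_valid_votes_polled_49"]
        for r in rows
    }
    return {
        key: reported[key] - total
        for key, total in totals.items()
        if total is not None and reported[key] != total
    }


# Made-up rows: one constituency, two candidates, two polling stations.
rows = [
    {"constituency_id": "X1", "candidate_id": 1, "candidate_votes": 100,
     "candidate_total_valid_votes_polled_49": 250},
    {"constituency_id": "X1", "candidate_id": 1, "candidate_votes": 150,
     "candidate_total_valid_votes_polled_49": 250},
    {"constituency_id": "X1", "candidate_id": 2, "candidate_votes": 40,
     "candidate_total_valid_votes_polled_49": 75},
    {"constituency_id": "X1", "candidate_id": 2, "candidate_votes": 30,
     "candidate_total_valid_votes_polled_49": 75},
]
gaps = form49_gaps(rows)
print(gaps)
```

A positive gap means form 49 reports more votes than the available polling station rows add up to, which is what a missing polling station would produce.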

Metadata for search engines

  • keywords: constituency_id, constituency_ps_id, candidate_id, candidate_name, candidate_party, candidate_votes, candidate_valid_share, candidate_valid_share_of_summed, candidate_total_valid_votes_summed, candidate_total_valid_votes_polled_49, ps_invalid_votes, ps_valid_votes, ps_total_votes, ps_valid_votes_summed and n_candidates

Variables

constituency_id

Distribution

0 missing values.

Summary statistics
name data_type missing complete n empty n_unique min max
constituency_id character 0 2052980 2052980 0 792 3 5

constituency_ps_id

Distribution

0 missing values.

Summary statistics
name data_type missing complete n empty n_unique min max
constituency_ps_id character 0 2052980 2052980 0 156114 5 9
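Both identifiers are stored as character data, so it is safest to keep them as strings when reading the file rather than letting a parser guess a numeric type. The sketch below uses Python's csv module on a tiny in-memory sample; the identifier values, the choice of integer columns, and the use of "NA" or an empty field for missing values are assumptions about the file, not facts taken from it.

```python
import csv
import io

# Columns converted to integers in this sketch; everything else stays a string.
INTEGER_COLUMNS = {"candidate_id", "candidate_votes"}


def read_long_rows(handle):
    """Read rows, keeping identifiers as strings and mapping NA counts to None."""
    rows = []
    for raw in csv.DictReader(handle):
        row = {}
        for name, value in raw.items():
            if name in INTEGER_COLUMNS:
                # assumption: missing values are written as "NA" or left empty
                row[name] = None if value in ("", "NA") else int(value)
            else:
                row[name] = value
        rows.append(row)
    return rows


# Made-up sample with a leading zero in the identifier.
sample = io.StringIO(
    "constituency_id,constituency_ps_id,candidate_id,candidate_votes\n"
    "001,001-7,1,NA\n"
    "001,001-7,2,35\n"
)
rows = read_long_rows(sample)
print(rows[0])
```

Keeping the identifiers as text preserves any leading zeros or separators they may contain.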

candidate_id

Distribution

0 missing values.

Summary statistics
name data_type missing complete n mean sd p0 p25 p50 p75 p100 hist
candidate_id integer 0 2052980 2052980 8.14 5.64 1 4 7 11 41 ▇▆▃▁▁▁▁▁

candidate_name

Distribution

0 missing values.

Summary statistics
name data_type missing complete n empty n_unique min max
candidate_name character 0 2052980 2052980 0 8232 4 48

candidate_party

Distribution

0 missing values.

Summary statistics
name data_type missing complete n empty n_unique min max
candidate_party character 0 2052980 2052980 0 95 11 56

candidate_votes

Distribution

3811 missing values.

Summary statistics
name data_type missing complete n mean sd p0 p25 p50 p75 p100 hist
candidate_votes integer 3811 2049169 2052980 47.95 106.13 0 0 2 27 1803 ▇▁▁▁▁▁▁▁

candidate_valid_share

Distribution

29828 missing values.

Summary statistics
name data_type missing complete n mean sd p0 p25 p50 p75 p100 hist
candidate_valid_share numeric 29828 2023152 2052980 0.076 0.16 0 0 0.0038 0.047 53.25 ▇▁▁▁▁▁▁▁
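The maximum of 53.25 shows that some shares exceed 1, which can only happen when candidate_votes is larger than the reported ps_valid_votes. A minimal sketch for listing such rows before analysis follows; the rows are invented, and whether to drop, correct, or keep them is left to the analyst.

```python
def implausible_shares(rows, column="candidate_valid_share"):
    """Return rows whose share in the given column is greater than 1."""
    return [r for r in rows if r[column] is not None and r[column] > 1]


# Made-up rows: one ordinary share, one missing, one above 1.
rows = [
    {"constituency_ps_id": "S1", "candidate_id": 1, "candidate_valid_share": 0.4},
    {"constituency_ps_id": "S2", "candidate_id": 1, "candidate_valid_share": None},
    {"constituency_ps_id": "S3", "candidate_id": 2, "candidate_valid_share": 2.5},
]
flagged = implausible_shares(rows)
print(flagged)
```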

candidate_valid_share_of_summed

Distribution

32130 missing values.

Summary statistics
name data_type missing complete n mean sd p0 p25 p50 p75 p100 hist
candidate_valid_share_of_summed numeric 32130 2020850 2052980 0.076 0.16 0 0 0.0039 0.047 1 ▇▁▁▁▁▁▁▁

candidate_total_valid_votes_summed

Distribution

53339 missing values.

Summary statistics
name data_type missing complete n mean sd p0 p25 p50 p75 p100 hist
candidate_total_valid_votes_summed integer 53339 1999641 2052980 11795.6 25111.36 0 202 1024 8160 172609 ▇▁▁▁▁▁▁▁

candidate_total_valid_votes_polled_49

Distribution

0 missing values.

Summary statistics
name data_type missing complete n mean sd p0 p25 p50 p75 p100 hist
candidate_total_valid_votes_polled_49 integer 0 2052980 2052980 11390.08 24594.08 -99 179 951 7657 173125 ▇▁▁▁▁▁▁▁
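The minimum of -99 cannot be a vote count, and this codebook does not say what it encodes. Assuming it is a placeholder for an unavailable form 49 total, one cautious option is to recode negative values to missing before computing means or differences, as in this sketch.

```python
def recode_placeholder(value):
    """Map negative totals (for example -99) to None; leave other values unchanged."""
    # assumption: negative values are placeholders, not real vote totals
    if value is not None and value < 0:
        return None
    return value


# Made-up totals including the placeholder value.
totals = [-99, 0, 951, None]
cleaned = [recode_placeholder(v) for v in totals]
print(cleaned)
```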

ps_invalid_votes

Distribution

71340 missing values.

Summary statistics
name data_type missing complete n mean sd p0 p25 p50 p75 p100 hist
ps_invalid_votes integer 71340 1981640 2052980 20.57 18.77 0 9 16 27 1016 ▇▁▁▁▁▁▁▁

ps_valid_votes

Distribution

26625 missing values.

Summary statistics
name data_type missing complete n mean sd p0 p25 p50 p75 p100 hist
ps_valid_votes integer 26625 2026355 2052980 620.99 244.29 0 450 605 777 2028 ▁▆▇▅▁▁▁▁

ps_total_votes

Distribution

71306 missing values.

Summary statistics
name data_type missing complete n mean sd p0 p25 p50 p75 p100 hist
ps_total_votes integer 71306 1981674 2052980 640.95 248.63 0 468 625 799 2053 ▁▆▇▅▁▁▁▁

ps_valid_votes_summed

Distribution

30181 missing values.

Summary statistics
name data_type missing complete n mean sd p0 p25 p50 p75 p100 hist
ps_valid_votes_summed integer 30181 2022799 2052980 622.51 243.97 0 452 606 778 2028 ▁▆▇▅▁▁▁▁

n_candidates

Distribution

0 missing values.

Summary statistics
name data_type missing complete n mean sd p0 p25 p50 p75 p100 hist
n_candidates integer 0 2052980 2052980 15.27 6.12 4 11 14 18 41 ▂▇▆▃▂▁▁▁

Codebook table

JSON-LD metadata

The following JSON-LD can be found by search engines if you share this codebook publicly on the web.

{
  "name": "2018 Pakistani General Elections Polling Station Level Data (long)",
  "description": "\n### Overview\n\nThese data are the official form 48s (polling station level returns by candidate), and are a reshaped version of the wide polling station data. They are made long so that each row is a polling station-candidate. For more thorough documentation of the data, please see the wide data page.\n\n### Variable summary\n\nIdentifiers:\n\n* constituency_id\n* constituency_ps_id: the pasted together constituency_id and ps_id, a unique identifier of the constituency-polling station\n\nPolling station level data:\n\n* ps_total_votes: reported number of invalid and valid votes on form 48\n* ps_invalid_votes: reported number of invalid votes on form 48\n* ps_valid_votes: reported number of valid votes on form 48\n* ps_valid_votes_summed: the summed number of `candidate_votes` in the polling station. This may differ from `ps_valid_votes` if there are errors in any of the candidate totals or in the `ps_valid_votes` field\n\nCandidate level data:\n\n* candidate_id: id number, unique to the constituency, that matches the column number in the wide data\n* candidate_name\n* candidate_party\n* candidate_votes: the number of valid votes cast for this candidate at this polling station\n* candidate_valid_share: the candidate's vote share of the reported total number of polling station valid votes (`candidate_votes / ps_valid_votes`)\n* candidate_valid_share_of_summed: the candidate's vote share of the summed candidate-level valid votes (`candidate_votes / ps_valid_votes_summed`)\n\nOther data:\n\n* candidate_total_valid_votes_summed: the constituency-level sum of `candidate_votes` for a candidate \n* candidate_total_valid_votes_polled_49: the constituency-level total votes polled by this candidate as reported on the constituency-level form 49s. This may differ from `candidate_total_valid_votes_summed` if some polling stations are missing or there are errors in any of the polling station level candidate vote totals.\n\n* n_candidates: the number of candidates competing in the constituency\n\n\n\n## Table of variables\nThis table contains variable names, labels, their central tendencies and other attributes.\n\n|name                                  |data_type |missing |complete |n       |empty |n_unique |min |max |mean     |sd       |p0  |p25 |p50    |p75   |p100   |hist     |\n|:-------------------------------------|:---------|:-------|:--------|:-------|:-----|:--------|:---|:---|:--------|:--------|:---|:---|:------|:-----|:------|:--------|\n|constituency_id                       |character |0       |2052980  |2052980 |0     |792      |3   |5   |NA       |NA       |NA  |NA  |NA     |NA    |NA     |NA       |\n|constituency_ps_id                    |character |0       |2052980  |2052980 |0     |156114   |5   |9   |NA       |NA       |NA  |NA  |NA     |NA    |NA     |NA       |\n|candidate_id                          |integer   |0       |2052980  |2052980 |NA    |NA       |NA  |NA  |8.14     |5.64     |1   |4   |7      |11    |41     |▇▆▃▁▁▁▁▁ |\n|candidate_name                        |character |0       |2052980  |2052980 |0     |8232     |4   |48  |NA       |NA       |NA  |NA  |NA     |NA    |NA     |NA       |\n|candidate_party                       |character |0       |2052980  |2052980 |0     |95       |11  |56  |NA       |NA       |NA  |NA  |NA     |NA    |NA     |NA       |\n|candidate_votes                       |integer   |3811    |2049169  |2052980 |NA    |NA       |NA  |NA  |47.95    |106.13   |0   |0   |2      |27    |1803   |▇▁▁▁▁▁▁▁ |\n|candidate_valid_share                 |numeric   |29828   |2023152  |2052980 |NA    |NA       |NA  |NA  |0.076    |0.16     |0   |0   |0.0038 |0.047 |53.25  |▇▁▁▁▁▁▁▁ |\n|candidate_valid_share_of_summed       |numeric   |32130   |2020850  |2052980 |NA    |NA       |NA  |NA  |0.076    |0.16     |0   |0   |0.0039 |0.047 |1      |▇▁▁▁▁▁▁▁ |\n|candidate_total_valid_votes_summed    |integer   |53339   |1999641  |2052980 |NA    |NA       |NA  |NA  |11795.6  |25111.36 |0   |202 |1024   |8160  |172609 |▇▁▁▁▁▁▁▁ |\n|candidate_total_valid_votes_polled_49 |integer   |0       |2052980  |2052980 |NA    |NA       |NA  |NA  |11390.08 |24594.08 |-99 |179 |951    |7657  |173125 |▇▁▁▁▁▁▁▁ |\n|ps_invalid_votes                      |integer   |71340   |1981640  |2052980 |NA    |NA       |NA  |NA  |20.57    |18.77    |0   |9   |16     |27    |1016   |▇▁▁▁▁▁▁▁ |\n|ps_valid_votes                        |integer   |26625   |2026355  |2052980 |NA    |NA       |NA  |NA  |620.99   |244.29   |0   |450 |605    |777   |2028   |▁▆▇▅▁▁▁▁ |\n|ps_total_votes                        |integer   |71306   |1981674  |2052980 |NA    |NA       |NA  |NA  |640.95   |248.63   |0   |468 |625    |799   |2053   |▁▆▇▅▁▁▁▁ |\n|ps_valid_votes_summed                 |integer   |30181   |2022799  |2052980 |NA    |NA       |NA  |NA  |622.51   |243.97   |0   |452 |606    |778   |2028   |▁▆▇▅▁▁▁▁ |\n|n_candidates                          |integer   |0       |2052980  |2052980 |NA    |NA       |NA  |NA  |15.27    |6.12     |4   |11  |14     |18    |41     |▂▇▆▃▂▁▁▁ |\n\n### Note\nThis dataset was automatically described using the [codebook R package](https://rubenarslan.github.io/codebook/) (version 0.8.1).",
  "identifier": "https://osf.io/mtsnd/",
  "datePublished": "2019-10-21",
  "creator": {
    "@type": "Person",
    "givenName": "Luke",
    "familyName": "Sonnet",
    "email": "luke.sonnet@gmail.com"
  },
  "citation": "Sonnet, Luke. 2019. “2018 Pakistani General Election Polling Station Data.” https://doi.org/10.17605/osf.io/mtsnd.",
  "url": "https://osf.io/mtsnd/",
  "keywords": ["constituency_id", "constituency_ps_id", "candidate_id", "candidate_name", "candidate_party", "candidate_votes", "candidate_valid_share", "candidate_valid_share_of_summed", "candidate_total_valid_votes_summed", "candidate_total_valid_votes_polled_49", "ps_invalid_votes", "ps_valid_votes", "ps_total_votes", "ps_valid_votes_summed", "n_candidates"],
  "@context": "http://schema.org/",
  "@type": "Dataset",
  "variableMeasured": [
    {
      "name": "constituency_id",
      "@type": "propertyValue"
    },
    {
      "name": "constituency_ps_id",
      "@type": "propertyValue"
    },
    {
      "name": "candidate_id",
      "@type": "propertyValue"
    },
    {
      "name": "candidate_name",
      "@type": "propertyValue"
    },
    {
      "name": "candidate_party",
      "@type": "propertyValue"
    },
    {
      "name": "candidate_votes",
      "@type": "propertyValue"
    },
    {
      "name": "candidate_valid_share",
      "@type": "propertyValue"
    },
    {
      "name": "candidate_valid_share_of_summed",
      "@type": "propertyValue"
    },
    {
      "name": "candidate_total_valid_votes_summed",
      "@type": "propertyValue"
    },
    {
      "name": "candidate_total_valid_votes_polled_49",
      "@type": "propertyValue"
    },
    {
      "name": "ps_invalid_votes",
      "@type": "propertyValue"
    },
    {
      "name": "ps_valid_votes",
      "@type": "propertyValue"
    },
    {
      "name": "ps_total_votes",
      "@type": "propertyValue"
    },
    {
      "name": "ps_valid_votes_summed",
      "@type": "propertyValue"
    },
    {
      "name": "n_candidates",
      "@type": "propertyValue"
    }
  ]
}
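A record like the one above can be read with any JSON parser, for example to list the documented variables programmatically. The sketch below uses Python's json module on a shortened, hand-written copy of the record; it assumes the full record has been saved as valid JSON.

```python
import json

# Shortened stand-in for the JSON-LD record; only two variables are kept.
jsonld_text = """
{
  "@context": "http://schema.org/",
  "@type": "Dataset",
  "variableMeasured": [
    {"name": "constituency_id", "@type": "propertyValue"},
    {"name": "candidate_votes", "@type": "propertyValue"}
  ]
}
"""


def variable_names(text):
    """Return the names listed under variableMeasured, in document order."""
    record = json.loads(text)
    return [entry["name"] for entry in record.get("variableMeasured", [])]


names = variable_names(jsonld_text)
print(names)
```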