Sonnet, Luke. 2019. “2018 Pakistani General Election Polling Station Data.” https://doi.org/10.17605/osf.io/mtsnd.
Dataset name: 2018 Pakistani General Elections Electoral Area and Census Block data
Each row in this data is unique to the combination of constituency, polling station, and census block. The data were released by the Election Commission of Pakistan (ECP) as Form 28.
Some polling station areas include several census blocks, and some census blocks are spread across several polling stations, so the mapping of polling station to census block is many-to-many. Furthermore, female-only and male-only polling stations can have different mappings from polling station to census block.
The ECP released Form 28s separately for the National and Provincial Assembly constituencies. While the delimitation of polling station areas should be identical for the two constituencies, in the data they are not. We report both of them here stacked together.
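Because the two assemblies are stacked in one file, summing voters over all rows would count each area twice. The following is a minimal sketch of filtering to one assembly before aggregating to the polling station level, and of listing census blocks that appear under more than one polling station. The column names come from this codebook; the sample rows, the identifier formats, and the assembly labels "National" and "Provincial" are assumptions made for illustration and should be checked against the actual data.

```python
from collections import defaultdict

# Hypothetical sample rows: column names follow the codebook, values are invented.
rows = [
    {"assembly": "National", "constituency_ps_id": "N1_1", "block_code": "A", "total_voters": 100},
    {"assembly": "National", "constituency_ps_id": "N1_1", "block_code": "B", "total_voters": 50},
    {"assembly": "National", "constituency_ps_id": "N1_2", "block_code": "B", "total_voters": 70},
    {"assembly": "Provincial", "constituency_ps_id": "P1_1", "block_code": "A", "total_voters": 100},
    {"assembly": "Provincial", "constituency_ps_id": "P1_1", "block_code": "B", "total_voters": 120},
]


def voters_per_station(rows, assembly):
    """Sum total_voters by constituency_ps_id within a single assembly."""
    totals = defaultdict(int)
    for row in rows:
        if row["assembly"] != assembly:
            continue  # skip the other assembly to avoid double counting
        if row["total_voters"] is None:
            continue  # skip rows with no voter count
        totals[row["constituency_ps_id"]] += row["total_voters"]
    return dict(totals)


def shared_blocks(rows, assembly):
    """Return the block codes that map to more than one polling station."""
    stations = defaultdict(set)
    for row in rows:
        if row["assembly"] == assembly and row["block_code"] is not None:
            stations[row["block_code"]].add(row["constituency_ps_id"])
    return sorted(block for block, ids in stations.items() if len(ids) > 1)


national_totals = voters_per_station(rows, "National")
national_shared = shared_blocks(rows, "National")
```

In this toy data, block "B" is split across two National Assembly polling stations, which is the many-to-many situation described above.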
Note: 714 of the 461290 non-missing block codes actually represent several census blocks. As such, the block_code field is a string. For these rows, the block_code field usually has a few block codes pasted together. However, splitting the rows that represent multiple census blocks is difficult, as it is unclear how to divide the voters and booths across census blocks.
While this affects only a small percentage of rows (about 0.15%), it could present a challenge for those polling stations. If you are having difficulties with this, please leave an issue or email Luke Sonnet.
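For readers who only need the list of census blocks a row refers to, and not a split of its voters, a dashed code can be expanded under one reading of the format. The sketch below assumes that each dash-separated suffix replaces the trailing digits of the first code, which matches the "332030301-06-07" example given in the block_code description. The function name expand_block_code is hypothetical, the reading is a guess rather than a documented rule, and the sketch does not resolve how voters or booths divide across the resulting blocks.

```python
def expand_block_code(block_code):
    """Expand a dashed block code into individual census block codes.

    Assumption: every part after the first dash replaces the same number
    of trailing digits in the first (full) code.
    """
    parts = block_code.split("-")
    base = parts[0]
    expanded = [base]
    for suffix in parts[1:]:
        # Refuse anything that does not fit the assumed pattern.
        if not suffix.isdigit() or len(suffix) >= len(base):
            raise ValueError(f"unexpected block code format: {block_code!r}")
        expanded.append(base[: len(base) - len(suffix)] + suffix)
    return expanded


codes = expand_block_code("332030301-06-07")
```

Codes that do not follow this pattern raise an error instead of being guessed at, so they can be inspected by hand.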
Throughout, a value of -88 denotes that the data should have been reported but was missing on the forms, either due to a scanning error or some other omission. Please check the variable distributions before using this data.
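A small sketch of recoding the sentinel before computing summaries is shown below. The value -88 comes from the note above. The summary tables in this codebook also report a minimum of -99 for the booth counts, so this sketch treats -99 as missing as well, which is an assumption and not something the documentation states. The helper names are hypothetical.

```python
# -88 is documented above; -99 is an assumption based on the booth minima
# shown in the summary tables.
MISSING_CODES = {-88, -99}


def clean_count(value):
    """Return None for missing or sentinel values, else the value itself."""
    if value is None or value in MISSING_CODES:
        return None
    return value


def mean_ignoring_missing(values):
    """Mean of the values that remain after recoding sentinels to None."""
    kept = [v for v in map(clean_count, values) if v is not None]
    if not kept:
        return None
    return sum(kept) / len(kept)


example_mean = mean_ignoring_missing([2, -88, 4, None, -99])
```

Leaving the sentinels in place would pull means and minima downward, which is one reason to inspect the distributions first.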
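* constituency_ps_id: this is used to merge with PS level data; the pasted together constituency_id and ps_id, a unique identifier of the constituency-polling station
* constituency_ps_id_block_code: constituency_ps_id pasted with the block_code to generate a unique identifier for each polling station-census block code
* province
* assembly
* constituency_id
* constituency_area
* ps_id: the official serial number of the polling station
* ps_name_from_form28: the name of the polling station; not always consistent within constituency_ps_id, and may not match the ps_name from the polling station data
* block_code: the census block code; note that some block_code values have dashes in them. These seem to represent more than one census block code (e.g. “332030301-06-07” seems to represent “332030301”, “332030306”, and “332030307”). Unfortunately this is how the census blocks were reported and we are unable to figure out which voters correspond to which of the census blocks. If you need help creating a full linking between polling stations and each of these block codes, please leave a message.
* block_code_type: “Urban”, “Rural”, or “Unknown”
* name_ea_rural: the name of the “rural” electoral area covered by this polling station-census block
* name_ea_urban: the name of the “urban” electoral area covered by this polling station-census block
* voter_serials_assigned_to_station: sometimes the serial number range of the voters covered in this EA (per polling station) is reported. These ranges are repeated here verbatim from the forms.
* male_voters: the number of registered male voters in this polling station-census block
* female_voters: the number of registered female voters in this polling station-census block
* total_voters: the number of registered voters in this polling station-census block
* male_booths: the number of assigned male booths for this polling station (Note: some errors seem to show different booths within polling station id in this dataset)
* female_booths: the number of assigned female booths for this polling station (Note: some errors seem to show different booths within polling station id in this dataset)
* total_booths: the total number of assigned booths for this polling station (Note: some errors seem to show different booths within polling station id in this dataset)
Metadata for search engines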
Citation: Sonnet, Luke. 2019. “2018 Pakistani General Election Polling Station Data.” https://doi.org/10.17605/osf.io/mtsnd.
Identifier: https://osf.io/mtsnd/
Date published: 2019-10-21
Creator: Luke Sonnet
0 missing values.
name | data_type | missing | complete | n | empty | n_unique | min | max |
---|---|---|---|---|---|---|---|---|
constituency_ps_id_block_code | character | 0 | 461320 | 461320 | 0 | 461320 | 11 | 62 |
0 missing values.
name | data_type | missing | complete | n | empty | n_unique | min | max |
---|---|---|---|---|---|---|---|---|
constituency_ps_id | character | 0 | 461320 | 461320 | 0 | 167616 | 5 | 9 |
0 missing values.
name | data_type | missing | complete | n | empty | n_unique | min | max |
---|---|---|---|---|---|---|---|---|
province | character | 0 | 461320 | 461320 | 0 | 4 | 3 | 11 |
0 missing values.
name | data_type | missing | complete | n | empty | n_unique | min | max |
---|---|---|---|---|---|---|---|---|
assembly | character | 0 | 461320 | 461320 | 0 | 2 | 8 | 10 |
0 missing values.
name | data_type | missing | complete | n | empty | n_unique | min | max |
---|---|---|---|---|---|---|---|---|
constituency_id | character | 0 | 461320 | 461320 | 0 | 849 | 3 | 5 |
0 missing values.
name | data_type | missing | complete | n | empty | n_unique | min | max |
---|---|---|---|---|---|---|---|---|
constituency_area | character | 0 | 461320 | 461320 | 0 | 682 | 4 | 54 |
0 missing values.
name | data_type | missing | complete | n | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ps_id | integer | 0 | 461320 | 461320 | 120.13 | 94.13 | 1 | 48 | 98 | 168 | 924 | ▇▃▂▁▁▁▁▁ |
0 missing values.
name | data_type | missing | complete | n | empty | n_unique | min | max |
---|---|---|---|---|---|---|---|---|
ps_name_from_form28 | character | 0 | 461320 | 461320 | 0 | 83096 | 3 | 242 |
30 missing values.
name | data_type | missing | complete | n | empty | n_unique | min | max |
---|---|---|---|---|---|---|---|---|
block_code | character | 30 | 461290 | 461320 | 0 | 163074 | 3 | 52 |
0 missing values.
name | data_type | missing | complete | n | empty | n_unique | min | max |
---|---|---|---|---|---|---|---|---|
block_code_type | character | 0 | 461320 | 461320 | 0 | 3 | 5 | 7 |
170869 missing values.
name | data_type | missing | complete | n | empty | n_unique | min | max |
---|---|---|---|---|---|---|---|---|
name_ea_rural | character | 170869 | 290451 | 461320 | 0 | 52597 | 2 | 244 |
291065 missing values.
name | data_type | missing | complete | n | empty | n_unique | min | max |
---|---|---|---|---|---|---|---|---|
name_ea_urban | character | 291065 | 170255 | 461320 | 0 | 25992 | 1 | 244 |
69248 missing values.
name | data_type | missing | complete | n | empty | n_unique | min | max |
---|---|---|---|---|---|---|---|---|
voter_serials_assigned_to_station | character | 69248 | 392072 | 461320 | 0 | 632 | 1 | 244 |
122 missing values.
name | data_type | missing | complete | n | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|---|---|
male_voters | integer | 122 | 461198 | 461320 | 252.54 | 265.27 | 0 | 0 | 208 | 413 | 3573 | ▇▂▁▁▁▁▁▁ |
162 missing values.
name | data_type | missing | complete | n | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|---|---|
female_voters | integer | 162 | 461158 | 461320 | 199.47 | 214.25 | 0 | 0 | 158 | 327 | 2499 | ▇▂▁▁▁▁▁▁ |
153 missing values.
name | data_type | missing | complete | n | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|---|---|
total_voters | integer | 153 | 461167 | 461320 | 458.75 | 338.62 | 0 | 221 | 389 | 616 | 4341 | ▇▃▁▁▁▁▁▁ |
102 missing values.
name | data_type | missing | complete | n | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|---|---|
male_booths | integer | 102 | 461218 | 461320 | 1.57 | 1.53 | -99 | 0 | 2 | 2 | 5 | ▁▁▁▁▁▁▁▇ |
102 missing values.
name | data_type | missing | complete | n | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|---|---|
female_booths | integer | 102 | 461218 | 461320 | 1.37 | 1.43 | -99 | 0 | 1 | 2 | 5 | ▁▁▁▁▁▁▁▇ |
102 missing values.
name | data_type | missing | complete | n | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|---|---|
total_booths | integer | 102 | 461218 | 461320 | 2.94 | 1.2 | -99 | 2 | 3 | 4 | 6 | ▁▁▁▁▁▁▁▇ |
JSON-LD metadata
The following JSON-LD can be found by search engines, if you share this codebook publicly on the web.
{
"name": "2018 Pakistani General Elections Electoral Area and Census Block data",
"description": "\n### Overview\n\nEach row in this data is unique to the combination of constituency, polling station, and census block, and was released by the Election Commission of Pakistan as Form 28.\n\nSome polling station areas include several census blocks, and some census blocks are spread across several polling stations, so the mapping of polling station to census block is many-to-many. Furthermore, female-only and male-only polling stations can have different mappings from polling station to census block.\n\nThe ECP released Form 28s separately for the National and Provincial Assembly constituencies. While the delimitation of polling station areas should be identical for the two constituencies, in the data they are not. We report both of them here stacked together.\n\nNote: 714 of the 461290 non-missing block codes actually represent several census blocks. As such, the `block_code` field is a string. For these rows, the `block_code` field usually has a few block codes pasted together. However, splitting the rows that represent multiple census blocks is difficult, as it is unclear how to divide the voters and booths across census blocks.\n\nWhile this only affects a small percentage of rows, it could present a challenge for those polling stations. If you are having difficulties with this, please leave an issue or email [Luke Sonnet](lukesonnet.com).\n\n### Variable summary\n\nThroughout, a value of `-88` denotes that the data should have been reported but was missing on the forms, either due to a scanning error or some other omission. Please check the variable distributions before using this data.\n\n* constituency_ps_id: *this is used to merge with PS level data*; the pasted together constituency_id and ps_id, a unique identifier of the constituency-polling station\n* constituency_ps_id_block_code: `constituency_ps_id` pasted with the block_code to generate a unique identifier for each polling station-census block code.\n* province\n* assembly\n* constituency_id\n* constituency_area\n* ps_id: the official serial number of the polling station\n* ps_name_from_form28: the name of the polling station; not always consistent within `constituency_ps_id`, and may not match the `ps_name` from the polling station data\n* block_code: the census block code; note, there are some `block_code` values that have dashes in them. These seem to represent more than one census block code (e.g. \"332030301-06-07\" seems to represent \"332030301\", \"332030306\", and \"332030307\"). Unfortunately this is how the census blocks were reported and we are unable to figure out which voters correspond to which of the census blocks. If you need help creating a full linking between polling stations and each of these block codes, please leave a message.\n* block_code_type: \"Urban\", \"Rural\", or \"Unknown\"\n* name_ea_rural: the name of the \"rural\" electoral area covered by this polling station-census block\n* name_ea_urban: the name of the \"urban\" electoral area covered by this polling station-census block\n* voter_serials_assigned_to_station: sometimes, the serial number range of the voters covered in this EA (per polling station) are reported. They are repeated here verbatim from the forms.\n* male_voters: the number of registered male voters in this polling station-census block\n* female_voters: the number of registered female voters in this polling station-census block\n* total_voters: the number of registered voters in this polling station-census block\n* male_booths: the number of assigned male booths for this polling station (Note: some errors seem to show different booths within polling station id in this dataset)\n* female_booths: the number of assigned female booths for this polling station (Note: some errors seem to show different booths within polling station id in this dataset)\n* total_booths: the total number of assigned booths for this polling station (Note: some errors seem to show different booths within polling station id in this dataset)\n\n\n\n## Table of variables\nThis table contains variable names, labels, their central tendencies and other attributes.\n\n|name |data_type |missing |complete |n |empty |n_unique |min |max |mean |sd |p0 |p25 |p50 |p75 |p100 |hist |\n|:---------------------------------|:---------|:-------|:--------|:------|:-----|:--------|:---|:---|:------|:------|:---|:---|:---|:---|:----|:--------|\n|constituency_ps_id_block_code |character |0 |461320 |461320 |0 |461320 |11 |62 |NA |NA |NA |NA |NA |NA |NA |NA |\n|constituency_ps_id |character |0 |461320 |461320 |0 |167616 |5 |9 |NA |NA |NA |NA |NA |NA |NA |NA |\n|province |character |0 |461320 |461320 |0 |4 |3 |11 |NA |NA |NA |NA |NA |NA |NA |NA |\n|assembly |character |0 |461320 |461320 |0 |2 |8 |10 |NA |NA |NA |NA |NA |NA |NA |NA |\n|constituency_id |character |0 |461320 |461320 |0 |849 |3 |5 |NA |NA |NA |NA |NA |NA |NA |NA |\n|constituency_area |character |0 |461320 |461320 |0 |682 |4 |54 |NA |NA |NA |NA |NA |NA |NA |NA |\n|ps_id |integer |0 |461320 |461320 |NA |NA |NA |NA |120.13 |94.13 |1 |48 |98 |168 |924 |▇▃▂▁▁▁▁▁ |\n|ps_name_from_form28 |character |0 |461320 |461320 |0 |83096 |3 |242 |NA |NA |NA |NA |NA |NA |NA |NA |\n|block_code |character |30 |461290 |461320 |0 |163074 |3 |52 |NA |NA |NA |NA |NA |NA |NA |NA |\n|block_code_type |character |0 |461320 |461320 |0 |3 |5 |7 |NA |NA |NA |NA |NA |NA |NA |NA |\n|name_ea_rural |character |170869 |290451 |461320 |0 |52597 |2 |244 |NA |NA |NA |NA |NA |NA |NA |NA |\n|name_ea_urban |character |291065 |170255 |461320 |0 |25992 |1 |244 |NA |NA |NA |NA |NA |NA |NA |NA |\n|voter_serials_assigned_to_station |character |69248 |392072 |461320 |0 |632 |1 |244 |NA |NA |NA |NA |NA |NA |NA |NA |\n|male_voters |integer |122 |461198 |461320 |NA |NA |NA |NA |252.54 |265.27 |0 |0 |208 |413 |3573 |▇▂▁▁▁▁▁▁ |\n|female_voters |integer |162 |461158 |461320 |NA |NA |NA |NA |199.47 |214.25 |0 |0 |158 |327 |2499 |▇▂▁▁▁▁▁▁ |\n|total_voters |integer |153 |461167 |461320 |NA |NA |NA |NA |458.75 |338.62 |0 |221 |389 |616 |4341 |▇▃▁▁▁▁▁▁ |\n|male_booths |integer |102 |461218 |461320 |NA |NA |NA |NA |1.57 |1.53 |-99 |0 |2 |2 |5 |▁▁▁▁▁▁▁▇ |\n|female_booths |integer |102 |461218 |461320 |NA |NA |NA |NA |1.37 |1.43 |-99 |0 |1 |2 |5 |▁▁▁▁▁▁▁▇ |\n|total_booths |integer |102 |461218 |461320 |NA |NA |NA |NA |2.94 |1.2 |-99 |2 |3 |4 |6 |▁▁▁▁▁▁▁▇ |\n\n### Note\nThis dataset was automatically described using the [codebook R package](https://rubenarslan.github.io/codebook/) (version 0.8.1).",
"identifier": "https://osf.io/mtsnd/",
"datePublished": "2019-10-21",
"creator": {
"@type": "Person",
"givenName": "Luke",
"familyName": "Sonnet",
"email": "luke.sonnet@gmail.com"
},
"citation": "Sonnet, Luke. 2019. “2018 Pakistani General Election Polling Station Data.” https://doi.org/10.17605/osf.io/mtsnd.",
"url": "https://osf.io/mtsnd/",
"keywords": ["constituency_ps_id_block_code", "constituency_ps_id", "province", "assembly", "constituency_id", "constituency_area", "ps_id", "ps_name_from_form28", "block_code", "block_code_type", "name_ea_rural", "name_ea_urban", "voter_serials_assigned_to_station", "male_voters", "female_voters", "total_voters", "male_booths", "female_booths", "total_booths"],
"@context": "http://schema.org/",
"@type": "Dataset",
"variableMeasured": [
{
"name": "constituency_ps_id_block_code",
"@type": "propertyValue"
},
{
"name": "constituency_ps_id",
"@type": "propertyValue"
},
{
"name": "province",
"@type": "propertyValue"
},
{
"name": "assembly",
"@type": "propertyValue"
},
{
"name": "constituency_id",
"@type": "propertyValue"
},
{
"name": "constituency_area",
"@type": "propertyValue"
},
{
"name": "ps_id",
"@type": "propertyValue"
},
{
"name": "ps_name_from_form28",
"@type": "propertyValue"
},
{
"name": "block_code",
"@type": "propertyValue"
},
{
"name": "block_code_type",
"@type": "propertyValue"
},
{
"name": "name_ea_rural",
"@type": "propertyValue"
},
{
"name": "name_ea_urban",
"@type": "propertyValue"
},
{
"name": "voter_serials_assigned_to_station",
"@type": "propertyValue"
},
{
"name": "male_voters",
"@type": "propertyValue"
},
{
"name": "female_voters",
"@type": "propertyValue"
},
{
"name": "total_voters",
"@type": "propertyValue"
},
{
"name": "male_booths",
"@type": "propertyValue"
},
{
"name": "female_booths",
"@type": "propertyValue"
},
{
"name": "total_booths",
"@type": "propertyValue"
}
]
}
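The JSON-LD above can also be read programmatically, for example to recover the list of documented variables. The sketch below uses Python's standard json module on a trimmed stand-in for the block above. The stand-in keeps only a few fields and two variables for brevity, and the helper name variable_names is hypothetical.

```python
import json

# Trimmed stand-in for the JSON-LD block above; most fields are omitted.
jsonld_text = """
{
  "name": "2018 Pakistani General Elections Electoral Area and Census Block data",
  "identifier": "https://osf.io/mtsnd/",
  "datePublished": "2019-10-21",
  "@context": "http://schema.org/",
  "@type": "Dataset",
  "variableMeasured": [
    {"name": "constituency_ps_id", "@type": "propertyValue"},
    {"name": "block_code", "@type": "propertyValue"}
  ]
}
"""


def variable_names(text):
    """Return the names listed under variableMeasured, in document order."""
    document = json.loads(text)
    return [entry["name"] for entry in document.get("variableMeasured", [])]


names = variable_names(jsonld_text)
```

Applied to the full block, the same function should return all 19 variable names listed under "keywords", provided the block parses as valid JSON.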