Aims of this homework
- learn about the basics of supervised machine learning in R
- reproduce the fake news classification example
- use the caret interface
Task 1: Preparing the data
You will use the fake news dataset from the lecture. Load that dataset (located in ./data/fakenews_corpus.RData).
Now you will need to extract features from the text. To do this, load the quanteda package and run the code below, which extracts unigrams, applies DFM trimming and binds the ‘outcome’ variable to the dataframe:
library(quanteda)
corpus_tokenised = tokens(fakenews_corpus)
ngrams_extract_1 = dfm(x = corpus_tokenised
, ngrams = 1
, verbose = F
, remove_punct = T
, stem = F
, remove = stopwords()
)
ngrams_extract_1 = dfm_trim(ngrams_extract_1, sparsity = 0.95)
fake_news_data = as.data.frame(ngrams_extract_1)
fake_news_data$outcome = fakenews_corpus$documents$veracity
fake_news_data = fake_news_data[, -1]
Have a look at the first 10 rows and 10 columns of the dataframe fake_news_data:
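A minimal sketch of the indexing involved, using a hypothetical stand-in data frame (`toy_data` is an assumption, since the real fake_news_data depends on the lecture corpus). Inside the square brackets, rows come first and columns second:

```r
# Hypothetical stand-in for fake_news_data: 20 documents, 15 count features
set.seed(1)
toy_data <- as.data.frame(matrix(rpois(20 * 15, lambda = 1), nrow = 20))

# Select the first 10 rows and the first 10 columns
preview <- toy_data[1:10, 1:10]
dim(preview)
```

The same pattern applied to the homework data would be `fake_news_data[1:10, 1:10]`.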
Task 2: Splitting the data
We covered in the lecture that you need to split your data into a training set and a test set. Load the caret package and use the createDataPartition function to split the data into a training set of 65% of the data and a test set of 35% of the data.
#your code here
library(caret)
Loading required package: lattice
Loading required package: ggplot2
Note: since the partitioning of the data involves random number generation, you will get different results every time you run this code. To avoid this (especially in your assignment), you can set a “seed” that ensures that the random number generation is identical if you run the code again. To do this, use the set.seed function BEFORE running the createDataPartition function.
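The effect of the seed can be sketched as follows. The data frame `toy_data` is a hypothetical stand-in for fake_news_data, and the seed value 1234 is simply the one used later in this homework; any fixed integer would do.

```r
library(caret)

# Hypothetical stand-in for fake_news_data: 100 fake and 100 real documents
toy_data <- data.frame(
  feature = rnorm(200),
  outcome = factor(rep(c("fake", "real"), each = 100))
)

# Setting the seed immediately before the partition makes the split repeatable
set.seed(1234)
in_training <- createDataPartition(y = toy_data$outcome, p = 0.65, list = FALSE)

training_data <- toy_data[in_training, ]
test_data <- toy_data[-in_training, ]

# Re-running with the same seed selects the same rows again
set.seed(1234)
in_training_again <- createDataPartition(y = toy_data$outcome, p = 0.65, list = FALSE)
identical(in_training, in_training_again)
```

Note that `p` is the share of rows that goes into the training set; the remaining rows form the test set.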
Task 3: Training your model
Use a linear Support Vector Machine algorithm and train your model on the training data.
fakenews_model_1 = train(outcome ~ .
, data = training_data
, method = "svmLinear"
)
Task 4: Assessing your model
We discussed why you would want to assess your model on the held-out test data.
To illustrate the differences, evaluate your model on the TRAINING data using the predict function:
model_1.predictions_trainingset = predict(fakenews_model_1, training_data)
table(training_data$outcome, model_1.predictions_trainingset)
      model_1.predictions_trainingset
       fake real
  fake  650    0
  real    0  650
Now do the same on the TEST data. Remember that the test data was never seen by the model and therefore did not occur in the training phase:
model_1.predictions_testset = predict(fakenews_model_1, test_data)
table(test_data$outcome, model_1.predictions_testset)
      model_1.predictions_testset
       fake real
  fake  293   57
  real   64  286
What does this show you?
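As a starting point for your answer, the overall accuracy of each table can be worked out by hand. The sketch below copies the counts from the two tables above into matrices; the helper name `accuracy` is made up for this illustration.

```r
# Confusion tables copied from the output above
# (rows: true class, columns: predicted class)
training_table <- matrix(c(650, 0, 0, 650), nrow = 2, byrow = TRUE,
                         dimnames = list(truth = c("fake", "real"),
                                         predicted = c("fake", "real")))
test_table <- matrix(c(293, 57, 64, 286), nrow = 2, byrow = TRUE,
                     dimnames = list(truth = c("fake", "real"),
                                     predicted = c("fake", "real")))

# Accuracy = correctly classified cases (the diagonal) / all cases
accuracy <- function(confusion) sum(diag(confusion)) / sum(confusion)

accuracy(training_table)  # 1300 / 1300 = 1
accuracy(test_table)      # 579 / 700, about 0.827
```

A perfect score on the training data next to a clearly lower score on unseen data is the pattern to think about here.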
Task 5: Including training control parameters
We also covered why you would want to apply cross-validation to your model. Include a training control object with 10-fold cross-validation in your model training phase. Use the same training and test set but build a new model that includes the 10-fold cross-validation:
training_controls = trainControl(method="cv"
, number = 10
)
fakenews_model_2 = train(outcome ~ .
, data = training_data
, trControl = training_controls
, method = "svmLinear"
)
model_2.predictions_testset = predict(fakenews_model_2, test_data)
table(test_data$outcome, model_2.predictions_testset)
      model_2.predictions_testset
       fake real
  fake  293   57
  real   64  286
Do these results differ from the model without cross-validation?
table(test_data$outcome, model_2.predictions_testset) == table(test_data$outcome, model_1.predictions_testset)
      model_2.predictions_testset
       fake real
  fake TRUE TRUE
  real TRUE TRUE
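One possible reading of the identical tables, offered as an interpretation rather than the official answer: in caret, the training control object governs how performance is estimated during resampling, while the final model is refit on the complete training set. As far as we can tell, caret fits svmLinear with a single default cost value unless you supply a tuning grid, so both resampling schemes would be expected to end in the same final model. The sketch below uses hypothetical toy data (`toy_data` is an assumption) and needs the kernlab package, which caret uses for svmLinear.

```r
library(caret)

# Hypothetical toy data standing in for the homework's training_data
set.seed(1234)
toy_data <- data.frame(x1 = rnorm(120), x2 = rnorm(120))
toy_data$outcome <- factor(
  ifelse(toy_data$x1 + toy_data$x2 + rnorm(120, sd = 0.5) > 0, "real", "fake")
)

# Model A: caret's default resampling (bootstrap)
set.seed(1234)
model_default <- train(outcome ~ ., data = toy_data, method = "svmLinear")

# Model B: 10-fold cross-validation
set.seed(1234)
model_cv <- train(outcome ~ ., data = toy_data, method = "svmLinear",
                  trControl = trainControl(method = "cv", number = 10))

# The resampled performance estimates come from different procedures ...
model_default$results
model_cv$results

# ... but both final models are refit on all of toy_data with the same settings
all(predict(model_default, toy_data) == predict(model_cv, toy_data))
```

The place where the two models are expected to differ is the `results` element, which holds the resampled accuracy estimate, not the predictions of the final model.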
Task 6: Using k-fold cross-validation on the full dataset
You can also use k-fold cross-validation on the full dataset and iteratively use one fold as the test set (have a look at this SO post). Try to implement this in R with 10 folds (hint: you do not need the train/test partition).
#your code here
fakenews_model_3 = train(outcome ~ .
, data = fake_news_data
, trControl = training_controls
, method = "svmLinear"
)
Now if you try to evaluate the model, you will note that the predict function is of little help. This is because it would just evaluate the final model on its own training data. Have a look at this:
model_3.predictions = predict(fakenews_model_3, fake_news_data)
table(fake_news_data$outcome, model_3.predictions)
      model_3.predictions
       fake real
  fake 1000    0
  real    1  999
Instead, you want to retrieve the average performance across the test sets of the 10 folds. You can do this by using the confusionMatrix function with the model as its argument.
confusionMatrix(fakenews_model_3)
Cross-Validated (10 fold) Confusion Matrix
(entries are percentual average cell counts across resamples)
          Reference
Prediction fake real
      fake 40.5  9.8
      real  9.4 40.2
 Accuracy (average) : 0.8075
How do these results compare to the previous ones?
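To make explicit what the cross-validated confusion matrix averages over, here is a minimal base-R sketch of k-fold cross-validation written out by hand. It rests on two assumptions that differ from the homework: the data are synthetic (`toy_data` is hypothetical), and a logistic regression fitted with `glm` is swapped in for the linear SVM so that the example needs no extra packages.

```r
# Hypothetical toy data: two numeric features and a binary outcome
set.seed(1234)
n <- 200
toy_data <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy_data$outcome <- factor(
  ifelse(toy_data$x1 + toy_data$x2 + rnorm(n, sd = 0.5) > 0, "real", "fake")
)

# Randomly assign every row to one of k folds of (nearly) equal size
k <- 10
fold_id <- sample(rep(1:k, length.out = n))

fold_accuracy <- numeric(k)
for (i in 1:k) {
  # Fold i is held out as the test set; the other k - 1 folds are used for training
  train_fold <- toy_data[fold_id != i, ]
  test_fold <- toy_data[fold_id == i, ]

  # Logistic regression models the probability of the second factor level ("real")
  fit <- glm(outcome ~ x1 + x2, data = train_fold, family = binomial)
  prob_real <- predict(fit, newdata = test_fold, type = "response")
  predicted <- ifelse(prob_real > 0.5, "real", "fake")

  fold_accuracy[i] <- mean(predicted == test_fold$outcome)
}

# Average held-out accuracy across the k folds
mean(fold_accuracy)
```

Every row is used for testing exactly once, which is why no separate train/test partition is needed in this setup.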
Task 7: Using different classification algorithms
Keep the meta parameters as in your second model (65/35 split, 10-fold CV on the training set) and use the k-nearest neighbour (kNN) classifier. Read about kNN here, here and here.
table(test_data$outcome, model_4.predictions_testset)
      model_4.predictions_testset
       fake real
  fake  332   18
  real  230  120
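The steps behind a table like this can be sketched as follows. The sketch runs on hypothetical toy data (`toy_data` and `knn_model` are assumed names), so its numbers will not match the output above; only the sequence of calls is meant to carry over.

```r
library(caret)

# Hypothetical toy data standing in for fake_news_data
set.seed(1234)
toy_data <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
toy_data$outcome <- factor(
  ifelse(toy_data$x1 + toy_data$x2 + rnorm(200, sd = 0.5) > 0, "real", "fake")
)

# 65/35 training/test split
set.seed(1234)
in_training <- createDataPartition(y = toy_data$outcome, p = 0.65, list = FALSE)
training_data <- toy_data[in_training, ]
test_data <- toy_data[-in_training, ]

# 10-fold cross-validation on the training set
training_controls <- trainControl(method = "cv", number = 10)

# Same call as for the SVM; only the method changes
knn_model <- train(outcome ~ ., data = training_data,
                   trControl = training_controls, method = "knn")

# Number of neighbours that caret selected during resampling
knn_model$bestTune

knn_predictions <- predict(knn_model, test_data)
table(test_data$outcome, knn_predictions)
```

Unlike the linear SVM, kNN has a tuning parameter (the number of neighbours k) that caret chooses by resampling, so here the cross-validation does influence the final model.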
Task 8: The data, data, data argument
You will often hear that more data is preferable to less data and that, especially in ML applications, you ideally need vast amounts of training data.
Look at the effect of the size of the training set by building a model with a 10/90, 20/80, 30/70, and 40/60 training/test set split.
set.seed(1234)
in_training = createDataPartition(y = fake_news_data$outcome
, p = .30
, list = FALSE
)
training_data = fake_news_data[ in_training,]
test_data = fake_news_data[-in_training,]
training_controls = trainControl(method="cv"
, number = 5
)
fakenews_model_3070 = train(outcome ~ .
, data = training_data
, trControl = training_controls
, method = "svmLinear"
)
model_3070.predictions_testset = predict(fakenews_model_3070, test_data)
table(test_data$outcome, model_3070.predictions_testset)
      model_3070.predictions_testset
       fake real
  fake  561  139
  real  189  511
What do you observe?
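Rather than copying the block above once per split, the comparison can be wrapped in a small helper and run in a loop. This is a sketch under stated assumptions: `toy_data` is hypothetical, `evaluate_split` is a made-up helper name, and svmLinear needs the kernlab package.

```r
library(caret)

# Hypothetical toy data standing in for fake_news_data
set.seed(1234)
toy_data <- data.frame(x1 = rnorm(400), x2 = rnorm(400))
toy_data$outcome <- factor(
  ifelse(toy_data$x1 + toy_data$x2 + rnorm(400, sd = 0.5) > 0, "real", "fake")
)

# Fit a model on a share p of the data and return the accuracy on the rest
evaluate_split <- function(data, p, method = "svmLinear") {
  set.seed(1234)
  in_training <- createDataPartition(y = data$outcome, p = p, list = FALSE)
  training_data <- data[in_training, ]
  test_data <- data[-in_training, ]

  model <- train(outcome ~ ., data = training_data,
                 trControl = trainControl(method = "cv", number = 5),
                 method = method)

  predictions <- predict(model, test_data)
  mean(predictions == test_data$outcome)
}

# 10/90, 20/80, 30/70 and 40/60 training/test splits
split_proportions <- c(0.10, 0.20, 0.30, 0.40)
test_accuracy <- sapply(split_proportions,
                        function(p) evaluate_split(toy_data, p))

data.frame(training_share = split_proportions, test_accuracy = test_accuracy)
```

Keep in mind that the test sets differ in size across the splits, so the accuracies are estimated with different precision.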
END