| Task | #3 |
|---|---|
| Position | 2 |
| Score | 39,606 |
| Best Score | 38,681 |
| Points | 0.977 |
Problem Statement
You have a data stream of records on STDIN. Each record is encoded in JSON. The structure is:
```json
{
  "user_id": 0,       // Max id is 10 000
  "currency": "",     // One of "GBP", "USD", "RUB", "JPY", "CHF"
  "transactions": [   // Max count is 10
    {
      "amount": 0,        // Max value is 1000
      "to_user_id": 0,
      "canceled": false   // May be omitted when false
    }
  ]
}
```
The field order is not guaranteed.
Find the total amount of external transactions (record.user_id != record.transaction.to_user_id) in USD in the shortest time.
Input Generator
The output of probe.cpp is:
```text
Amount range: [0,999]
User id range: [0,9999]
Txn user id range: [0,9999]
Num records: 1000000
Num USD records: 199503
Total Transactions: 4496378
  External: 4495932
  Explicitly cancelled: 2473701
  Explicitly cancelled to true: 450056
Total USD transactions: 898189
  External: 898104
Txn len counts:
  0: 100309
  1: 99810
  2: 100319
  3: 100283
  4: 99994
  5: 99769
  6: 99995
  7: 99901
  8: 99597
  9: 100023
  10: 0
```
So:
- USD records make up ~20% of records
- Most USD transactions are external
- Transaction counts per record are uniformly distributed between 0 and 9 (despite the stated max of 10, a count of 10 never occurs)
Solution Sketch
Like most HighLoad tasks, this is not a JSON parsing task. SimdJSON gets a score of ~250k, so you need to find a way to eliminate most of the input without parsing it.
The key trick is to notice that USD records can be uniquely identified by the S character: of the five currency codes, only "USD" contains an S, and nothing else in a record (the keys, the digits, true/false) contains one either. So every S in the input maps uniquely to one USD record, which lets us eliminate 80% of the input.
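A minimal sketch of that filter, assuming the whole input sits in one buffer; `handle_usd_record` is a hypothetical placeholder for the per-record parser:

```cpp
#include <cstddef>
#include <cstring>

void handle_usd_record(const char* buf, const char* s); // hypothetical record parser

// Scan the buffer for 'S'. Every hit is the middle byte of a "USD" literal,
// so it identifies exactly one USD record; everything between hits is
// skipped without being parsed at all.
void scan(const char* buf, size_t len) {
    const char* p = buf;
    const char* end = buf + len;
    while ((p = static_cast<const char*>(memchr(p, 'S', size_t(end - p))))) {
        handle_usd_record(buf, p); // p points at the 'S' inside "USD"
        ++p;
    }
}
```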
Once you find an S, there are six cases:
1. {"currency": "USD", "user_id": 123, "transactions": [...]}
2. {"currency": "USD", "transactions": [...], "user_id": 123}
3. {"user_id": 123, "currency": "USD", "transactions": [...]}
4. {"transactions": [...], "currency": "USD", "user_id": 123}
5. {"user_id": 123, "transactions": [...], "currency": "USD"}
6. {"transactions": [...], "user_id": 123, "currency": "USD"}
Depending on where you encounter the S, you need to parse the transactions either forward or backward. Special-case the fast paths (e.g. empty transaction lists), and keep two separate code paths for the forward and backward cases; a dispatch sketch follows.
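A hypothetical sketch of the dispatch, assuming the exact spacing shown in the six cases above; `parse_backward` and `parse_mixed` stand in for the two real code paths:

```cpp
void parse_backward(const char* buf, const char* s); // everything precedes "USD"
void parse_mixed(const char* buf, const char* s);    // at least one field follows

// s points at the 'S' inside "USD": s[1] is 'D', s[2] is the closing quote,
// and s[3] is either '}' (currency was the last field, cases 5-6) or ','
// (more fields ahead, cases 1-4).
inline void handle_usd_record(const char* buf, const char* s) {
    if (s[3] == '}')
        parse_backward(buf, s); // user_id and transactions are both behind us
    else
        parse_mixed(buf, s);    // parse forward, backtracking for what's behind
}
```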
For transactions, I used a switch with jumps and didn’t hardcode the six cases. Maybe you can do even more casework and build a lookup table for parsing: there are <100 possible lengths of a record, and maybe it’s possible to encode some extraction logic in each LUT cell. My hypothesis is it’ll be as fast as, or slower than, pure record parsing.
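The shape of that LUT idea might look like the following; this is entirely hypothetical (the author didn’t build it), with every cell pointing at a stub so the sketch stays runnable:

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical LUT dispatch: with fewer than 100 distinct record lengths,
// pre-specialize one extractor per length and dispatch on length alone.
using Extractor = void (*)(const char* rec, size_t len);

void generic_extract(const char* rec, size_t len) {
    std::printf("fallback parse of %zu-byte record\n", len);
}

Extractor lut[128];

int main() {
    // Real cells would hold generated, layout-specific extractors.
    for (auto& cell : lut) cell = generic_extract;
    const char rec[] = "{\"currency\": \"USD\", \"user_id\": 1, \"transactions\": []}";
    size_t len = sizeof(rec) - 1;
    lut[len](rec, len);
}
```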
Push the parsed user IDs, amounts, and canceled flags into arrays. Once you’ve processed the entire record and extracted the user ID, use SIMD with some masking to isolate the amounts we need to add. Since the amounts are small enough, you can accumulate them in parallel using maddubs, with the multiplier being 1 if the amount is valid and 0 if it is invalid (canceled or internal).
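A minimal sketch of the masked accumulation. For simplicity it uses pmaddwd (`_mm256_madd_epi16`) on 16-bit amounts rather than the byte-wise maddubs described above, and all names are illustrative:

```cpp
#include <immintrin.h>
#include <cstddef>
#include <cstdint>

// Sum amounts[i] where valid[i] == 1, zeroing the rest via the multiply.
// madd multiplies 16 lanes pairwise and adds adjacent products into 8
// 32-bit accumulator lanes; totals here stay well under INT32_MAX.
int64_t sum_valid(const int16_t* amounts, const int16_t* valid, size_t n) {
    __m256i acc = _mm256_setzero_si256();
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m256i a = _mm256_loadu_si256((const __m256i*)(amounts + i));
        __m256i m = _mm256_loadu_si256((const __m256i*)(valid + i));
        acc = _mm256_add_epi32(acc, _mm256_madd_epi16(a, m));
    }
    alignas(32) int32_t lanes[8];
    _mm256_store_si256((__m256i*)lanes, acc);
    int64_t sum = 0;
    for (int k = 0; k < 8; ++k) sum += lanes[k];
    for (; i < n; ++i) sum += amounts[i] * valid[i]; // scalar tail
    return sum;
}
```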
Thoughts
Like most parsing state machines, this was also a pain to debug, and not something I want to touch again. There are so many edge cases everywhere.
Some tips from writing state machines: I use `consteval int cslen(std::string_view s) { return s.size(); }` to avoid magic numbers when doing pointer jumps while parsing. `i += cslen(" }")` is more understandable than `i += 2`, and helps immensely when debugging.
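A self-contained illustration (the record snippet is made up):

```cpp
#include <cstdio>
#include <string_view>

// Compile-time length of a string literal, so each pointer jump spells out
// exactly the text it skips instead of a bare offset.
consteval int cslen(std::string_view s) { return static_cast<int>(s.size()); }

int main() {
    const char rec[] = "\"amount\": 123, ...";
    const char* p = rec;
    p += cslen("\"amount\": "); // self-documenting, unlike p += 10
    std::printf("%.3s\n", p);   // prints 123
}
```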
If you don’t want to do this crazy optimization, just write a tight parser with a bunch of labels and gotos, as in the toy skeleton below. It’s good enough for ~60k.
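A toy skeleton of that style, obviously simplified: one label per state, transitions as jumps, and only integer extraction instead of the real field logic:

```cpp
#include <cstdio>

int main() {
    const char* p = "{\"amount\": 123, \"to_user_id\": 45}";
scan:                                         // state: looking for a number
    if (*p == '\0') return 0;
    if (*p < '0' || *p > '9') { ++p; goto scan; }
    {                                         // state: consuming a digit run
        unsigned value = 0;
        while (*p >= '0' && *p <= '9') value = value * 10 + unsigned(*p++ - '0');
        std::printf("%u\n", value);           // prints 123, then 45
    }
    goto scan;
}
```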