Commit Graph

60 Commits

Author SHA1 Message Date
Mario Zechner 4be086ef70 Fix Billa 2023-10-22 00:31:26 +02:00
Mario Zechner 57d6b46a22 Temporarily stop updating Billa. 2023-10-17 01:44:24 +02:00
Mario Zechner b4ac2bed8f Fix poisoned ground truth data... 2023-10-17 01:42:02 +02:00
Mario Zechner dd938ba7cd Closes #135
Penny now has a subcategory that leads back to the all categories page. This triggered an infinite recursion which eventually goes OOM boom.
2023-07-11 15:31:02 +02:00
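The recursion described above is the usual failure mode when a category tree contains a back-link. A minimal sketch of a visited-set guard follows; `collectCategories`, `fetchChildren`, and the toy data are hypothetical and not taken from the repository's Penny code.

```javascript
// Hypothetical category walker; `fetchChildren` stands in for the store API call.
function collectCategories(rootId, fetchChildren) {
    const visited = new Set();
    const order = [];
    const walk = (id) => {
        if (visited.has(id)) return; // already seen: this is a back-link, stop here
        visited.add(id);
        order.push(id);
        for (const child of fetchChildren(id)) walk(child);
    };
    walk(rootId);
    return order;
}

// Toy data (assumed): "sub" links back to "all", which would recurse forever without the guard.
const tree = { all: ["food", "drinks"], food: ["sub"], drinks: [], sub: ["all"] };
const categories = collectCategories("all", (id) => tree[id] ?? []);
console.log(categories);
```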
Mario Zechner b95f39b809 Closes #131. Deduplicate items generally, so newly added store code that doesn't deduplicate won't fuck up the canonical data. 2023-07-05 21:23:23 +02:00
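One way such a general deduplication pass could look, as a sketch only: the key of `store` plus `id` and the function name `dedupItems` are assumptions, not the fields or helper the commit actually uses.

```javascript
// Hypothetical dedup pass: keep the first item seen for each (store, id) pair.
function dedupItems(items) {
    const seen = new Set();
    return items.filter((item) => {
        const key = `${item.store}:${item.id}`;
        if (seen.has(key)) return false;
        seen.add(key);
        return true;
    });
}

// Toy data (assumed): the second entry duplicates the first.
const rawItems = [
    { store: "billa", id: "1", name: "Milch" },
    { store: "billa", id: "1", name: "Milch" },
    { store: "spar", id: "1", name: "Milch" },
];
const uniqueItems = dedupItems(rawItems);
console.log(uniqueItems.length);
```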
Mario Zechner 52748c1d70 Re-evaluate unavailable flag on every history merge. 2023-06-21 21:21:30 +02:00
Mario Zechner 60ecb68924 Mark unavailable items in data and with 💀 in the UI, add emojis to category names, hide filter groups if ! query. 2023-06-21 17:07:45 +02:00
Mario Zechner f2ef75e5c4 Remove old entries for discount only stores. Closes #102 2023-06-21 16:00:59 +02:00
Mario Zechner b05702aff5 Hofer and MPREIS categories. 2023-06-21 15:20:28 +02:00
Mario Zechner 303d25ccb5 Categories for Billa & Spar, infra to add categories for other stores.
Billa maps directly to the canonical categories. Spar uses a mapping file stores/spar-categories.json.

Each store has a generateCategoryMapping() function which is called once in analysis.js:updateData() and analysis.js:replay(). The function is responsible for

* Fetching the latest categories
* Merging them with already mapped categories
* Reporting new categories that haven't been mapped yet
* Reporting categories that have been mapped but are no longer part of the latest set of categories
* Saving the merged mappings to disk

This scheme might not work for all stores, in which case updateData() and replay() will use a knn approach to figure out the category for an item. See #81
2023-06-21 01:29:00 +02:00
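The merge-and-report steps listed in this commit can be sketched roughly as below. This is not the repository's generateCategoryMapping(); the name `mergeCategoryMappings`, the field names (`id`, `description`, `code`), and the choice to drop removed categories from the merged result are all assumptions. Fetching and saving to disk are left out.

```javascript
// Hypothetical merge step for a store's category mapping.
// `latest`: categories just fetched from the store, `existing`: mappings already on disk.
function mergeCategoryMappings(latest, existing) {
    const existingById = new Map(existing.map((c) => [c.id, c]));
    const latestIds = new Set(latest.map((c) => c.id));

    // Keep an existing canonical code if there is one, otherwise mark as unmapped (null).
    const merged = latest.map((c) => ({ ...c, code: existingById.get(c.id)?.code ?? null }));
    const unmapped = merged.filter((c) => c.code === null); // new, not mapped yet
    const removed = existing.filter((c) => !latestIds.has(c.id)); // mapped, but gone from the store
    return { merged, unmapped, removed };
}

// Toy data (assumed).
const latestCategories = [
    { id: "a", description: "Obst" },
    { id: "b", description: "Neu" },
];
const existingMappings = [
    { id: "a", description: "Obst", code: "00" },
    { id: "z", description: "Alt", code: "15" },
];
const mapping = mergeCategoryMappings(latestCategories, existingMappings);
console.log(mapping.unmapped.map((c) => c.id), mapping.removed.map((c) => c.id));
```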
Mario Zechner 6569b17da2 Remove binary encoding, web worker, clean-up. 2023-06-18 23:23:02 +02:00
Mario Zechner d01d984706 Fix binary encoding of unit. 2023-06-18 14:45:35 +02:00
Mario Zechner c7537c341e Binary format optimization 4.4mb -> 3.9mb, don't store urls where not needed, use product-id instead of code-internal for spar items. 2023-06-17 01:11:21 +02:00
Mario Zechner c9740b8660 Improved compression, 4.8mb -> 4.4mb and faster decoding. 2023-06-16 23:12:37 +02:00
Mario Zechner d63d42d623 Write store info as part of binary serialization. 2023-06-16 20:38:39 +02:00
Mario Zechner ea5c133003 Binary compression (it's worse), unit prices in charts, small improvements. 2023-06-16 16:01:13 +02:00
Mario Zechner 9646f07db5 More charting changes. restore.js can now take a h43z.json file and merge it with the restored history. Ask Mario for the data. 2023-06-14 17:07:02 +02:00
Mario Zechner 7e0b6ac1f6 Improved compression 2023-06-14 00:11:34 +02:00
Mario Zechner 5240ab2a46 Fix blocking updateData when `SKIP_FETCHING_STORE_DATA` is set in env. 2023-06-08 15:14:06 +02:00
Matthias Hochsteger 0184a70fa5 Skip fetching store data if SKIP_FETCHING_STORE_DATA env variable is set 2023-06-06 21:09:45 +02:00
Matthias Hochsteger 9ff667ec92 Compare generated data with reference file
If `latest-canonical-reference.json` exists, compare the generated data with it
and store the changes in `latest-canonical-changes.json`. Also print the number
of changed articles on the command line.

This feature is just for development (especially for changes in stores/)
and has no effect on the generated data.
2023-06-06 20:55:02 +02:00
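A comparison of this kind could be sketched as follows. The function name `diffAgainstReference`, the item fields (`store`, `id`, `price`), and the shape of the change records are assumptions, not the format actually written to the changes file. Comparing via JSON.stringify is sensitive to property order, which is acceptable here only because both lists are assumed to come from the same serializer.

```javascript
// Hypothetical diff between a reference item list and freshly generated items.
function diffAgainstReference(reference, current) {
    const key = (item) => `${item.store}:${item.id}`;
    const referenceByKey = new Map(reference.map((item) => [key(item), item]));
    const changes = [];
    for (const item of current) {
        const old = referenceByKey.get(key(item));
        if (!old) changes.push({ type: "added", item });
        else if (JSON.stringify(old) !== JSON.stringify(item)) changes.push({ type: "changed", old, item });
        referenceByKey.delete(key(item)); // whatever remains afterwards was removed
    }
    for (const old of referenceByKey.values()) changes.push({ type: "removed", old });
    return changes;
}

// Toy data (assumed).
const referenceItems = [
    { store: "billa", id: "1", price: 1.0 },
    { store: "billa", id: "2", price: 2.0 },
];
const currentItems = [
    { store: "billa", id: "1", price: 1.2 },
    { store: "billa", id: "3", price: 3.0 },
];
const changes = diffAgainstReference(referenceItems, currentItems);
console.log(`${changes.length} changed articles`);
```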
Matthias Hochsteger 505b3c75b3 fallback argument in convertUnit
Fixes #70
2023-06-05 14:26:42 +02:00
Mario Zechner 1cc61d0cb7 Remove logging. 2023-06-03 00:36:10 +02:00
Mario Zechner 3638b80c02 Refactor migration, switch from gzip to brotli compression. See #44
See migration.js if you want to manually convert raw data files between formats.
2023-06-03 00:01:41 +02:00
Werner Robitza de75e6686b skip already gzipped files in json -> gzip migration 2023-06-02 20:08:05 +02:00
Mario Zechner 23f512087e Refactored and fix #55
- `readJson()` now just checks for the file extension to decide whether to uncompress instead of taking a flag.
- moved migration logic from index.js to analysis.js:migrateToGzip
- fixed `restore()` in analysis.js
- also calling `migrateToGzip()` in replay.js
- Fix billa canonicalization for Dossier data
- Fix spar canonicalization for Dossier data and data from 2022.
2023-06-02 18:34:14 +02:00
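The extension check mentioned for `readJson()` might look roughly like this. It is a sketch under assumptions: the name `readJsonByExtension`, the `.gz` suffix, and the use of Node's synchronous zlib calls are illustrative, and the temporary files exist only to exercise both paths.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");
const zlib = require("zlib");

// Hypothetical reader: decompress only when the file name ends in ".gz".
function readJsonByExtension(file) {
    const data = fs.readFileSync(file);
    const text = file.endsWith(".gz") ? zlib.gunzipSync(data).toString("utf8") : data.toString("utf8");
    return JSON.parse(text);
}

// Write the same payload once plain and once gzipped, then read both back.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "readjson-"));
const plainFile = path.join(dir, "items.json");
const gzipFile = path.join(dir, "items.json.gz");
const payload = JSON.stringify([{ id: 1, price: 1.99 }]);
fs.writeFileSync(plainFile, payload);
fs.writeFileSync(gzipFile, zlib.gzipSync(payload));
const fromPlain = readJsonByExtension(plainFile);
const fromGzip = readJsonByExtension(gzipFile);
```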
Mario Zechner 8bf0d65d89 Merge branch 'main' into compress-json 2023-06-02 16:56:22 +02:00
Mario Zechner c6bbd0e03b Increased maxWidth to 150 in prettier config, formatted all the things. See #52. 2023-06-02 16:45:54 +02:00
Christian Tschugg 02bd7e5ff8 Compress raw data files on disk, fixes badlogic/heissepreise#51 2023-06-02 16:24:58 +02:00
Mario Zechner 6b1f84cfe3 Manually merged PR #48, Penny support (only gets us 275 products) 2023-06-01 14:40:28 +02:00
Mario Zechner 2a53cbd91a Added Rewe Germany 2023-06-01 01:09:25 +02:00
Mario Zechner 1270002b99 Initial DM Germany support. 2023-05-30 19:29:33 +02:00
Mario Zechner 74ce151c7d Merge pull request #42 from iantsch/fetch-canonical-parallel
Fetch canonical compressed in parallel
2023-05-30 13:57:14 +02:00
Christian Tschugg 0f5a08d723 Fetch canonical compressed in parallel 2023-05-30 13:09:27 +02:00
Christian Tschugg fe7c79e314 Fix old data conversion issue 2023-05-30 11:51:05 +02:00
Mario Zechner f26b5c3625 Closes #34 2023-05-30 10:34:25 +02:00
HannesOberreiter f497f1259f style: 🔥 remove unused variables 2023-05-29 16:06:47 +02:00
Mario Zechner 55af83b3d0 Fix replay(), fix thead see #27 2023-05-26 16:05:42 +02:00
Christian Tschugg f2ffe5957d Refactor to generic store syntax 2023-05-26 12:45:30 +02:00
Mario Zechner 4d7645efaf Add query link generation. 2023-05-26 11:28:40 +02:00
Mario Zechner 9b907b52b1 Fix up restore(), mergePriceHistory(), sort items by store and name before writing canonical list. Closes #23 2023-05-26 08:56:58 +02:00
simmac 8711edb503 Added MPREIS support 2023-05-26 00:34:26 +02:00
Christian Tschugg 21ef1c9dae Fetch data in parallel 2023-05-25 16:17:56 +02:00
Christian Tschugg 066f147728 Refactor lidl fallback unit to empty string 2023-05-25 14:24:34 +02:00
Christian Tschugg 661ca82f6c Add limited support for LIDL 2023-05-25 14:03:19 +02:00
Mario Zechner c54f9261df Sort of fix CSS for mobile. Filter in Preisänderungen. 2023-05-25 13:32:53 +02:00
Simeon Macke (01505675) fadb104d72 Add DM support 2023-05-25 12:28:12 +02:00
Simeon Macke (01505675) 72f913ea76 Add filter for "bio"/organic 2023-05-24 17:26:51 +02:00
Mario Zechner 32bc64c532 Revert Billa endpoint. 2023-05-22 14:24:26 +02:00
Mario Zechner 856f08c71f Don't use store-id for Billa, will omit products otherwise. 2023-05-22 13:54:38 +02:00