# 🎯 COMPACT LIST: 700 articles

Handling 700 scientific articles means a large volume of data. Below are the best ways to produce compact lists of them.

---

## 📊 SOLUTIONS

### **Option 1: Python script (BEST)** ⭐

```bash
python3 compact_articles_list.py
```

**Creates:**
- `articles_compact_list.csv` - for Excel
- `articles_compact_list.html` - viewing and sorting in a browser
- `articles_compact_list.md` - Markdown table

**Fields:**
```
| # | Pealkiri | Aasta | Ćœurnaal | Allikfail | Ülevaade (30 sĂ”na) | Relevants/10 |
```

The columns are: number, title, year, journal, source file, summary (30 words), and relevance out of 10.

**Advantages:**
- ✅ The HTML is interactive (sorting, filtering)
- ✅ The CSV opens directly in Excel
- ✅ The Markdown is convenient in VS Code
- ✅ Summaries are truncated to 30 words, so they stay readable

---

### **Option 2: Bash + curl + jq (FAST)**

```bash
chmod +x compact_curl_list.sh
./compact_curl_list.sh
```

**Creates:**
- `articles_compact.csv`
- `articles_compact.md`

**Advantages:**
- ✅ Fast (no Python startup)
- ✅ Fewer files
- ✅ jq truncates the summaries (20 words)

---

### **Option 3: Simple curl command (MANUAL)**

```bash
# CSV format (data rows only, no header row)
curl -s http://100.80.222.54:9020/v1/graphql \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{ "query": "{ Get { ScientificArticle(limit: 50) { title year journal abstract_en relevance_score } } }" }' \
  | jq -r '.data.Get.ScientificArticle[] | [.title, .year, .journal, .relevance_score] | @csv' \
  > ~/Downloads/articles.csv

# Markdown table (data rows only, no header row)
curl -s http://100.80.222.54:9020/v1/graphql \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{ "query": "{ Get { ScientificArticle(limit: 50) { title year journal relevance_score } } }" }' \
  | jq -r '.data.Get.ScientificArticle[] | "| \(.title) | \(.year) | \(.journal) | \(.relevance_score)/10 |"' \
  > ~/Downloads/articles.md
```

---

## 🔍 FIELDS IN THE COMPACT QUERY

```graphql
{
  Get {
    ScientificArticle(limit: 100) {
      # Core selection (compact)
      title            # Article title
      source_file      # Source PDF file
      year             # Year of publication
      journal          # Journal name
      abstract_en      # Short summary
      relevance_score  # Relevance (0-10)

      # Optional: increases the response size
      # doi              # DOI
      # authors          # Author list
      # key_concepts     # Keywords
      # processing_date  # Processing date
    }
  }
}
```

---

## ⚡ OPTIMISATION TIPS

### **1. Raise the limit**

Instead of `limit: 100`, use `limit: 700` (or anything up to 10,000):

```graphql
ScientificArticle(limit: 700) { title source_file year }
```

⚠ **Warning:** fetching 700+ articles takes roughly 5-15 seconds.

### **2. Filter by keywords**

```graphql
ScientificArticle(
  limit: 100
  where: {
    path: ["key_concepts"]   # Weaviate expects the path as a list
    operator: ContainsAny
    # valueText replaces the deprecated valueString on recent Weaviate versions
    valueText: ["transport", "road safety"]
  }
) { title key_concepts }
```

### **3. Sort by relevance**

```graphql
ScientificArticle(
  limit: 100
  sort: [{ path: ["relevance_score"], order: desc }]
) { title relevance_score }
```

### **4. Truncation - first 10 words only**

```bash
# In jq (the // "" guard keeps a missing abstract from raising an error)
(.abstract_en // "") | split(" ") | .[0:10] | join(" ")
```

---

## 📈 PERFORMANCE COMPARISON

| Solution | Outputs | Speed | Convenience |
|----------|---------|-------|-------------|
| Python | CSV, HTML, MD | 3-5s | ⭐⭐⭐⭐⭐ |
| Bash | CSV, MD | 2-3s | ⭐⭐⭐⭐ |
| curl command | JSON/CSV | 1-2s | ⭐⭐⭐ |

---

## đŸ’Ÿ FILE FORMATS

### **CSV** (Excel, LibreOffice)

```
title,year,journal,abstract,relevance
"Article Title",2024,"Nature","Abstract text...",9
```

### **HTML** (sortable in a browser)

```html
......
<table>
  <tr><th>Pealkiri</th><th>Aasta</th></tr>
  <tr><td>Article</td><td>2024</td></tr>
</table>
```

### **Markdown** (VS Code)

```markdown
| Pealkiri | Aasta | Relevants |
|----------|-------|-----------|
| Article | 2024 | 9/10 |
```

---

## 🚀 QUICK START

**Simplest:**
```bash
python3 compact_articles_list.py
```

**Fastest:**
```bash
./compact_curl_list.sh
```

**Manual test:**
```bash
curl -s http://100.80.222.54:9020/v1/graphql \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"query": "{ Get { ScientificArticle(limit: 5) { title year } } }"}' \
  | jq .
```

---

## 🔧 CUSTOMISATION

### In the Python script:

```python
# Line 13: fields
COMPACT_QUERY = {
    "query": """{
      Get {
        ScientificArticle(limit: 100) {
          title
          source_file
          year
          # Add more fields here
        }
      }
    }"""
}
```

### In the Bash script:

```bash
# Line 15: GraphQL query (add more fields after source_file)
-d '{"query": "{ Get { ScientificArticle(limit: 100) { title source_file } } }"}'
```

---

## 💡 TIPS

1. **Sorting in Excel:**
   - Data → Filter (AutoFilter in LibreOffice)
   - Click the ↓ arrow in the column header
   - Choose "Sort A to Z" or "Sort by Values"

2. **Opening the HTML in a browser:**
   ```bash
   # macOS
   open ~/Downloads/articles_compact_list.html
   # Linux
   xdg-open ~/Downloads/articles_compact_list.html
   # Windows (cmd.exe): start "" "%USERPROFILE%\Downloads\articles_compact_list.html"
   ```

3. **Markdown in VS Code:**
   ```bash
   code ~/Downloads/articles_compact_list.md
   ```

4. **Importing the CSV into Google Sheets:**
   - Open sheets.google.com
   - File → Import → Upload → select the CSV
   - Continue working with formulas directly in the sheet

5. **Printing to PDF:**
   - With the HTML file open, choose Print (Ctrl+P)
   - Select: Save as PDF
   - Tick: Background graphics

---

## ⚠ WEAVIATE LIMITS

- **Maximum results:** 10,000 objects
- **Offset + limit sum:** max 10,000
- **Query timeout:** 30 seconds
- **Performance:** larger limits are slower

**Solution:** use pagination (offset + limit) for large data sets:

```graphql
# First 100
limit: 100, offset: 0

# Next 100
limit: 100, offset: 100
```

---

## 📞 TROUBLESHOOTING

| Problem | Solution |
|---------|----------|
| "jq: command not found" | `sudo apt install jq` |
| Python `ModuleNotFoundError` | Install the module named in the error with `pip3 install <module>`; if that fails, run `pip3 install --upgrade pip` first |
| "Timeout" (query timed out) | Reduce the `limit` value (e.g. 50 instead of 100) |
| HTML looks wrong | Open it in Firefox, not Internet Explorer |
| CSV opens incorrectly | Import it with UTF-8 encoding and a comma delimiter |

---

**Completed:** 09.01.2026
**Version:** 2.0
**Format:** Compact, optimised for 700+ articles
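
The pagination advice in the Weaviate limits section can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the contents of `compact_articles_list.py`: the helper names `page_offsets`, `build_query`, `truncate_words` and `fetch_page` are hypothetical, while the endpoint, class name and field names are taken from the curl examples in this document.

```python
import json
import urllib.request

# Endpoint copied from the curl examples in this document.
GRAPHQL_URL = "http://100.80.222.54:9020/v1/graphql"
# Ceiling for offset + limit, as stated in the limits section.
MAX_WINDOW = 10_000


def page_offsets(total, page_size=100):
    """Return (limit, offset) pairs that together cover `total` objects."""
    if total > MAX_WINDOW:
        raise ValueError("offset + limit may not exceed 10,000")
    return [(min(page_size, total - off), off)
            for off in range(0, total, page_size)]


def build_query(limit, offset, fields=("title", "source_file", "year")):
    """Build the JSON body for one page of ScientificArticle objects."""
    query = "{ Get { ScientificArticle(limit: %d, offset: %d) { %s } } }" % (
        limit, offset, " ".join(fields))
    return {"query": query}


def truncate_words(text, n=30):
    """Keep the first n words; a missing summary becomes an empty string."""
    return " ".join((text or "").split()[:n])


def fetch_page(limit, offset):
    """POST one page to the endpoint (hypothetical helper; needs network access)."""
    body = json.dumps(build_query(limit, offset)).encode("utf-8")
    request = urllib.request.Request(
        GRAPHQL_URL, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["data"]["Get"]["ScientificArticle"]
```

For 700 articles, `page_offsets(700)` yields seven pages of 100, and each pair can be passed to `fetch_page`. Smaller pages keep every request well under the timeout, at the cost of more round trips.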