PDF Thumbnail Generation Summary
Date: February 8, 2026 Total Folders: 24 failed publications
Results Overview
- ✅ Successfully Generated: 13 publications (54%)
- ⏸️ Requires Manual Download: 11 publications (46%)
- ❌ Failed: 0 publications (0%)
✅ Successfully Generated Thumbnails (13)
These publications now have figure_1.png and figure_2.png thumbnails generated from the first page of their PDFs:
- chen2023analyzing - NeurIPS 2023
- giannone2023aligning - NeurIPS 2023
- vishnubhotla2023towards - NeurIPS 2023
- wang2023post - NeurIPS 2023
- zhang2023identifiability - NeurIPS 2023
- han2024value_workshop - OpenReview
- hurwitz2021targeted - NeurIPS 2021
- loh2023mitigating - OpenReview
- loh2023multi - PMLR (ICML 2023)
- regenwetter2024constraining - OpenReview
- valkov2018houdini - NeurIPS 2018
- xu2019variational - PMLR (ICML 2019)
- xu2021bayesian - NeurIPS 2021
Technical Details
- Resolution: 800px width (height auto-scaled to maintain aspect ratio)
- Format: PNG
- Source: First page of official PDF from publisher
- Tool: pdftoppm (Poppler utilities)
⏸️ Requires Manual Download (11)
These publications require manual intervention due to access restrictions or non-standard formats.
Patents (4) - Google Patents does not allow automated PDF downloads
- bhandwaldar2025contrastive
- US Patent App. 18/679,009
- https://patents.google.com/patent/US20250369893A1
- Action: Visit URL in browser, save PDF or screenshot
- giannone2025generative
- US Patent 20240152879A1
- https://patents.google.com/patent/US20240152879A1
- Action: Visit URL in browser, save PDF or screenshot
- srivastava2024simulator
- US Patent 11915121
- https://patents.google.com/patent/US11915121
- Action: Visit URL in browser, save PDF or screenshot
- wang2025post
- US Patent App. 20250149455A1
- https://patents.google.com/patent/US20250149455A1
- Action: Visit URL in browser, save PDF or screenshot
Paywalled/Restricted Access Papers (7)
- elngar2021deep
- Title: “A Deep Learning Based Analysis of the Big Five Personality Traits from Handwriting Samples Using Image Processing”
- https://www.semanticscholar.org/paper/…/302d179a5c43ec65eff4a6980af24c03f32e5c80
- Action: Search Google Scholar, ResearchGate, or contact author
- de1997different
- Title: “Different Phases of Differently Heat Treated CdO from X-Ray and TEM Studies”
- Journal: Indian Journal of Physics (1997)
- https://link.springer.com/article/10.1007/BF02848393
- Action: Institutional access or library request
- Note: Very old paper, may need special access
- heyrani2022links
- ASME IDETC-CIE 2022 proceedings
- https://asmedigitalcollection.asme.org/IDETC-CIE/proceedings-abstract/IDETC-CIE2022/86229/V03AT03A013/1150351
- Action: ASME subscription or contact author for preprint
- meyer2026unlocking
- Journal: Nucleic Acids Research (Oxford)
- https://academic.oup.com/nar/article/54/3/gkag023/8454528
- Action: Institutional access or check for open access version
- singh2020single
- Springer book chapter
- https://link.springer.com/chapter/10.1007/978-981-15-2043-3_36
- Action: Springer subscription or contact author
- srivastava2014burst
- Title: “Burst Detection Modulated Document Clustering: A Partially Feature-Pivoted Approach To First Story Detection”
- https://www.research.ed.ac.uk/en/publications/burst-detection-modulated-document-clustering-a-partially-feature-
- Action: Try Edinburgh Research Explorer direct PDF link or contact library
- srivastava2019deep
- Title: “Deep generative modelling for amortised variational inference”
- Type: PhD Thesis, University of Edinburgh
- https://era.ed.ac.uk/handle/1842/36114
- Action: Try direct PDF: https://era.ed.ac.uk/bitstream/handle/1842/36114/Srivastava2019.pdf
- Alternative: Contact University of Edinburgh library
Next Steps for Manual Downloads
For Patents (14-17):
- Visit each patent URL in a web browser
- Use browser’s “Print to PDF” or patent download button
- Save PDF to the respective publication folder
- Run conversion script (see below)
For Academic Papers (18-24):
- Check institutional access to journals/databases
- Try alternative sources:
- Google Scholar (search by title)
- arXiv (if available)
- ResearchGate (author uploads)
- Author’s personal/university website
- Contact authors directly for preprints if needed
- For very old papers, consider if a placeholder is acceptable
Converting Manually Downloaded PDFs
After downloading PDFs to their respective folders, use this command:
# Make script executable
chmod +x /tmp/convert_manual_pdfs.sh
# Run the conversion script
/tmp/convert_manual_pdfs.sh
Or convert individual PDFs:
cd /Users/akash/akashgit.github.io/publications-data/FOLDER_NAME
pdftoppm -png -f 1 -l 1 -scale-to-x 800 -scale-to-y -1 YOUR_PDF.pdf temp_page
mv temp_page-01.png figure_1.png
cp figure_1.png figure_2.png
Technical Notes
URL Pattern Conversions Implemented
The script successfully handles these publisher patterns:
- NeurIPS Proceedings:
/hash/XXX-Abstract.html→/file/XXX-Paper.pdf- Works for 2018-2023 papers
- OpenReview:
/forum?id=XXX→/pdf?id=XXX
- PMLR (Proceedings of Machine Learning Research):
/vXXX/paperid.html→/vXXX/paperid/paperid.pdf
- arXiv:
/abs/XXXX.XXXXX→/pdf/XXXX.XXXXX.pdf
Publisher Access Limitations
These publishers were skipped due to authentication requirements:
- Google Patents (no direct PDF API)
- Semantic Scholar (requires API key or scraping)
- Springer (paywall)
- ASME Digital Collection (paywall)
- Oxford Academic (paywall)
- Edinburgh Research Archive (access restrictions)
Files Generated
/tmp/generate_thumbnails_v2.py- Main generation script/tmp/convert_manual_pdfs.sh- Helper for manual PDFs/tmp/manual_publications.md- Detailed manual download guide/Users/akash/akashgit.github.io/publications-data/_thumbnail_generation_report_v2.json- JSON report
Statistics
- Success Rate: 54% (13/24) automated
- Total Images Generated: 26 PNG files (13 × 2)
- Total PDF Size Downloaded: ~45 MB
- Average Image Size: ~190 KB per PNG
- Processing Time: ~3 minutes (with 1s delays between requests)
Generated by automated thumbnail generation script