Issues: adbar/trafilatura
Validate value of output_format in extract() and bare_extraction()
enhancement
New feature or request
#644
opened Jul 16, 2024 by
adbar
Missing h1 heading if <header> outside of <article>
question
Further information is requested
#642
opened Jul 11, 2024 by
chrisgoddard
AttributeError in baseline extraction of JSON text
bug
Something isn't working
#640
opened Jul 8, 2024 by
Honesty-of-the-Cavernous-Tissue
Can I get an extracted element's CSS selector?
question
Further information is requested
#639
opened Jul 4, 2024 by
theabhinavdas
links/urls are not appearing using extract
question
Further information is requested
#636
opened Jul 1, 2024 by
alroythalus
some extraction duplicated in xml
question
Further information is requested
#634
opened Jun 27, 2024 by
fortyfourforty
Account for empty cells in table extraction (xml)
enhancement
New feature or request
#633
opened Jun 27, 2024 by
fortyfourforty
Sometimes, html tags remain on the string
bug
Something isn't working
#627
opened Jun 23, 2024 by
masylum
Parts of article block are sometimes not being extracted
feedback
Feedback from users requested
#622
opened Jun 17, 2024 by
naktinis
Image/Video caption and credits removal
documentation
Docs in need of update or extension
question
Further information is requested
#616
opened Jun 6, 2024 by
hamsarajan
It's set include_images=True, but there is no picture
bug
Something isn't working
#610
opened May 31, 2024 by
dark2star
Remove HTML doc pages from package and add instructions to build them
documentation
Docs in need of update or extension
maintenance
Software compatibility and continuity
#609
opened May 30, 2024 by
adbar
New port of readability.js?
question
Further information is requested
#604
opened May 23, 2024 by
zirkelc
Add option to provide XPaths for content extraction
enhancement
New feature or request
#596
opened May 16, 2024 by
klvbdmh
utils.decode_file(): add switch for full detection or GZip only
enhancement
#595
opened May 15, 2024 by
adbar
Extracting content from a URL is getting None
question
Further information is requested
#586
opened May 5, 2024 by
Fabiha15
Wrong links position in text from telegram post
question
Further information is requested
#585
opened May 4, 2024 by
RedHotUnicorn
Removing related links at end of article/sidebar on news websites?
bug
Something isn't working
#584
opened May 3, 2024 by
rahulbot
Update XML-TEI reference data
maintenance
Software compatibility and continuity
#577
opened Apr 29, 2024 by
adbar
Extract text from buttons for semantic elements
question
Further information is requested
#573
opened Apr 23, 2024 by
zirkelc
Why lzma for data compression?
maintenance
Software compatibility and continuity
question
Further information is requested
#559
opened Apr 15, 2024 by
Yomguithereal
Preserve horizontal space in code blocks
enhancement
New feature or request
#553
opened Apr 9, 2024 by
mittsommer