Bookmarking and Archiving Content
Best Free Site Archiving Services
archive.today
Help:Using archive.today - Wikipedia
Blazingly fast and Memento API compatible.
Link to the most recent archive of a URL: http://archive.is/http://en.wikipedia.org/
archive.today - Wikipedia https://archive.is/?url=https://www.cnc24.com/
Comparison between the Wayback Machine and Archive.Today - EverybodyWiki Bios & Wiki
Internet Archive aka Wayback Machine
https://web.archive.org/web/wikipedia.org
Arquivo.pt
Resources:
Web archives that support the Memento protocol natively: Memento Depot
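As a rough illustration of the lookup links above, here is a minimal Python sketch that builds the "newest snapshot" URL for archive.today and for the Wayback Machine. The function names are made up for this note, and both patterns are taken only from the two example links above; the Wayback form is assumed to resolve to the most recent capture.

```python
# Prefixes copied from the example links in the notes above.
ARCHIVE_TODAY_PREFIX = "http://archive.is/"
WAYBACK_PREFIX = "https://web.archive.org/web/"


def archive_today_latest(url: str) -> str:
    """Link to the newest archive.today snapshot: the target URL is appended as-is."""
    return ARCHIVE_TODAY_PREFIX + url


def wayback_latest(url: str) -> str:
    """Link to the Wayback Machine copy of a URL (assumed to pick the latest capture)."""
    return WAYBACK_PREFIX + url
```

Calling `archive_today_latest("http://en.wikipedia.org/")` reproduces the archive.is link quoted above.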
Archive Wallabag Links Forever
Adding a URL to Wallabag
This is more complicated than a simple cURL call.
https://doc.wallabag.org/en/developer/api/oauth.html
Obtain API token
The API token expires and is obtained from the client ID, client secret, username and user password.
Create a client_id and client_secret to access Wallabag via the API, then request the token with cURL. The password must be URL encoded.
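A minimal Python sketch of the token request, assuming the `/oauth/v2/token` endpoint and the `grant_type=password` form fields described in the Wallabag OAuth documentation linked above. The base URL, the function name and all credentials are placeholders. `urlencode` percent-encodes every value, which takes care of the URL-encoded password requirement.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(base_url, client_id, client_secret, username, password):
    """Build (but do not send) the POST request that asks Wallabag for a token."""
    form = urlencode({
        "grant_type": "password",      # password grant, per the linked OAuth docs
        "client_id": client_id,
        "client_secret": client_secret,
        "username": username,
        "password": password,          # urlencode() escapes special characters
    })
    return Request(
        base_url.rstrip("/") + "/oauth/v2/token",
        data=form.encode("ascii"),
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen(req)` should return JSON; per the linked docs the token is expected under `access_token`, with its lifetime under `expires_in`.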
Then add a URL
The cURL request must be a POST, not a GET: a GET request will not work, while a POST does.
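A sketch of the working POST variant in Python, assuming the `/api/entries.json` endpoint and the `Authorization: Bearer` header from the Wallabag API documentation. The helper name, the base URL and the token value are placeholders, not part of the original notes.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_add_entry_request(base_url, access_token, url):
    """Build (but do not send) the POST request that saves one URL to Wallabag."""
    return Request(
        base_url.rstrip("/") + "/api/entries.json",
        data=urlencode({"url": url}).encode("ascii"),   # form body: url=<encoded>
        headers={"Authorization": "Bearer " + access_token},
        method="POST",                                   # GET would not create an entry
    )
```

The request can be sent with `urllib.request.urlopen(req)` once a valid token has been obtained.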
Archiving URLs that were posted to Wallabag
Everything is bookmarked in Wallabag and then distributed via RSS to other archival engines.
- Save to Wallabag. Expose an RSS feed of URLs
- Also save to Notion, in a separate database “The New Pocket”, just in case
- Also save to Archive.org for good measure
I’ve implemented most of the automation tasks using pipedream.com and ifttt.com
In the future, I could host my own ArchiveBox, as it is amazing. That would be only a two-step process.
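The Wallabag-to-archivers fan-out described above can be sketched in a few lines of Python: read the feed, pull out the entry links, hand each one to the archivers. This is only an illustration, not the pipedream/IFTTT setup itself. It assumes an RSS 2.0 style feed with `<item><link>` elements; if the Wallabag instance serves Atom, the element names differ. `links_from_rss` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def links_from_rss(feed_xml):
    """Return the <link> text of every <item> in an RSS 2.0 document, in feed order."""
    root = ET.fromstring(feed_xml)
    links = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link and link.strip():
            links.append(link.strip())
    return links
```

Each returned link can then be passed to whichever archiving services are in use.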
Archive.org
Posting a URL to Archive.org
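A plain GET to `https://web.archive.org/save/` followed by the page URL is the commonly used way to ask the Wayback Machine for a fresh capture. A minimal sketch, assuming that anonymous endpoint is still available and not rate-limited for you; the helper name and the User-Agent string are placeholders.

```python
from urllib.request import Request

SAVE_PREFIX = "https://web.archive.org/save/"


def build_save_request(url):
    """Build (but do not send) a capture request for the given page URL."""
    return Request(
        SAVE_PREFIX + url,                            # target URL is appended as-is
        headers={"User-Agent": "my-archiver/0.1"},    # placeholder client name
        method="GET",
    )
```

Passing the result to `urllib.request.urlopen` triggers the capture; a capture can take a while, so a generous timeout is advisable.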
For archive.today, it is not that easy:
archiving - How do I archive a webpage to archive.today using wget or curl? - Web Applications Stack Exchange
wabarc/archive.is: A command-line tool and Go package for wayback webpage to archive.today
jjjake/internetarchive: A Python and Command-Line Interface to Archive.org
Wallabag
Quite serious competitors:
Pinboard & Pocket Alternatives
- Shiori, written in Go; it has web extensions, and the phone problem is solved with the amazing HTTP Shortcuts
- shaarli/Shaarli: The personal, minimalist, super-fast, database free, bookmarking service in PHP
- MarceauKa/shaark: Self-hosted platform to keep and share your content: web links, posts, passwords and pictures. Inspired by Shaarli, built with Laravel and Vue.js
- Reminiscence, written in Python
- WebCrate - Organize your Web, with Deta as a personal cloud
- Pincone → Bookmark manager for teams • Pincone, but Personal Pincone is 100% free.
Paid
- Pocket costs about 3.4 €/month
- Raindrop.io - Keep your favorites handy; not expensive, about 2.7 €/month
- Bookmarks.io - Better Bookmarking is defunct
- An excellent article on choosing an article-extraction library and the keyword extractors they used: Unsupervised Auto-labeling of Websites • Pincone. The authors, by the way, are the Zagreb-based company Ars Futura - product design, mobile and web app development agency
Wallabag Setup on ISPConfig
Wallabag is an open-source, self-hostable Pocket alternative with full-text search, text and media archiving, and clients for all the mobile platforms and for Kindle e-readers.
The internal option “Download images locally” must be enabled.
Nginx options in ISPConfig:
client_max_body_size 3000m;
##subroot web ##
location / {
    try_files $uri /app.php$is_args$args;
}
location ~ \.php$ {
    fastcgi_split_path_info ^(.+\.php)(/.*)$;
    {FASTCGIPASS}
    include /etc/nginx/fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_param DOCUMENT_ROOT $document_root;
    internal;
}
Symfony
https://symfony.com/doc/current/setup/web_server_configuration.html#nginx
Installation on a shared hosting
The static package requires --env=prod to be appended to each command, as the static package is only usable as a prod environment.
You must create your first user with the command php bin/console wallabag:install --env=prod
If an error occurs at this step due to bad settings, clear the cache with php bin/console cache:clear --env=prod before you retry the previous command.
The default configuration uses MySQL for the database, and the database settings are in app/config/parameters.yml. Passwords must be surrounded by single quotes (').
The same config file is used for other configuration options such as domain_name.
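The single-quote rule has two catches worth a small sketch. In a YAML single-quoted scalar, a literal `'` is written as `''`, and Symfony treats `%` in parameter values as the start of a parameter reference, so a literal `%` is written as `%%`. The helper below is hypothetical, and the `database_password` key name in the usage line is an assumption about the file layout.

```python
def yaml_single_quoted(value):
    """Quote a value for parameters.yml: escape % for Symfony, then ' for YAML."""
    escaped = value.replace("%", "%%")      # Symfony: literal percent sign
    escaped = escaped.replace("'", "''")    # YAML: literal single quote
    return "'" + escaped + "'"


def password_line(password):
    """Render one config line; the key name is an assumed example."""
    return "database_password: " + yaml_single_quoted(password)
```

For a password without `%` or `'`, this simply wraps the value in single quotes.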
Importing
https://doc.wallabag.org/en/admin/asynchronous.html#launch-redis-consumer
Make the files in web/bin executable:
cd web/bin
chmod g+x *
Trigger the import:
bin/console wallabag:import:redis-worker --env=prod pinboard -vv >> import-pinboard.log
New entrant in 2023:
Briefkasten is a very attractive, well-coded open-source bookmarks project. Its only limitation is that the “full-text” search covers only the link descriptions, not the content of the linked pages.
GitHub: ndom91/briefkasten
How to save a web page on the Internet Archive? | LearnTips
Web Archiving Services | AlternativeTo
List of all archivers:
Archive aggregators: a service that searches all archives and tells you which ones have the page archived and which do not
- Memento Time Travel - only lists results from other archives; an aggregator
- Tutorial: Back Up a Web Page or Web Site – Data Horde
- Cached Pages
- CachedView
Alternatives:
- Ghostarchive, a website archive - perfect, uses ReplayWeb.page from Webrecorder Tools
- FreezePage - odd
- Library of Congress Web Archives - cannot be used to archive a page?
- Stanford Web Archive Portal - cannot be used to archive a page?
Wayback Machine, Wayback Machine (All), Google Cache, Google Cache (Text-Only), Bing Cache, Yandex Cache, Archive.is, Archive.is (All), Gigablast Cache, Yahoo Japan Cache, Megalodon, Baidu Snapshot, Yahoo Cache, Qihoo 360 Search Snapshot, Mail.ru Cache
Presumably like this? http://timetravel.mementoweb.org/memento/2022/https://www.cnc24.com/ but it does not work with a later date
? Archive.St - Free web page archiving service
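The Memento Time Travel link shown above follows a prefix/timestamp/URL pattern. A minimal Python sketch that rebuilds it, assuming (as the "2022" example suggests) that the timestamp may be truncated to any prefix of YYYYMMDDhhmmss; `timetravel_url` is a hypothetical helper.

```python
TIMETRAVEL_PREFIX = "http://timetravel.mementoweb.org/memento/"


def timetravel_url(url, stamp):
    """Build a Time Travel link for `url` near the given (possibly partial) timestamp."""
    # Accept YYYY, YYYYMM, YYYYMMDD, ... up to YYYYMMDDhhmmss.
    if not stamp.isdigit() or len(stamp) not in (4, 6, 8, 10, 12, 14):
        raise ValueError("stamp must look like YYYY[MM[DD[hh[mm[ss]]]]]")
    return TIMETRAVEL_PREFIX + stamp + "/" + url
```

With `stamp="2022"` and the cnc24.com address, this yields exactly the link quoted above.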
Should I archive it myself with wget?
Tutorial: Back Up a Web Page or Web Site – Data Horde
jsvine/waybackpack: Download the entire Wayback Machine archive for a given URL.
motherboardgithub/mass_archive: A basic Python tool for pushing a web page to multiple archiving services at once
oduwsdl/archivenow: A Tool To Push Web Resources Into Web Archives
ArchiveReady.com is a very interesting idea and tool that checks how suitable a website is for archiving; it is known as the “Website Archivability Evaluation Tool”
Browser Extensions
Archive Site Extensions
- thefoofighter/The-Archiver-WebExtension, for both Firefox and Chrome, archives on Archive.org and Archive.Today
- rahiel/archiveror for Firefox works with Archive.org and Archive.Today plus some more. In Chrome, it can make local copies of webpages in a single MHTML file using Ctrl+Shift+S. For Firefox consider the “Save Page WE” add-on.
- Archive Page for Firefox and Chrome is only for Archive.Today, both for archiving and for searching the archive
- tjhorner/archivebox-exporter for Firefox and Chrome can send pages from your browser to your self-hosted ArchiveBox archiver
- AaronLenoir/SendToArchive is a basic Firefox extension for Archive.org
- arantius/resurrect-pages searches through page archives on Firefox; there also seems to be a fork: Albirew/resurrect-pages-isup-edition
- jonathanmccann/archive-url-firefox-addon is Firefox only and Archive.org only
Retrieve Archive Extensions
- dessant/web-archives for Firefox and Chrome is an extension for viewing archived and cached versions of web pages
Adds a button to the Mozilla Firefox toolbar. When clicked, it sends the URL of the current tab to archive.today to preserve a snapshot of the page, and opens the result in a new tab.
Tools:
- ArchiveTeam/grab-site: The archivist’s web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
- birros/web-archives: A web archives reader
Stephen Ostermiller’s Cache Bookmarklets
JS Small: Stephen Ostermiller’s Cache Bookmarklets
- Resurrect Pages – Get this Extension for 🦊 Firefox (en-US) arantius/resurrect-pages: A tool to expose cached copies of webpages, especially when they are unavailable.
Android add-ons
- Share2Archive opens the web page in Archive.Today using the default browser
- PaperSpan is nothing groundbreaking