Research » Bookmarking and Archiving Content


Best Free Site Archiving Services

Help:Using - Wikipedia

Blazingly fast and Memento API compatible Link to the most recent URL archive - Wikipedia

Comparison between the Wayback Machine and Archive.Today - EverybodyWiki Bios & Wiki

Internet Archive aka Wayback Machine

Latest URL archive


Web archives support the Memento protocol natively: Memento Depot

Archive Wallabag Links Forever

Adding a URL to Wallabag

It is more complicated than a simple cURL call.

Obtain API token

The API token expires and is obtained from the client ID, client secret, username and user password:

curl -s ""

Create a client_id+client_secret to access Wallabag via API and then run the curl (the password must be URL encoded):

$ curl -i "
  &client_id=<the_client_id> \
  &client_secret=<the_client_secret> \
  &username=<username> \
  &password=<url_encoded_password>"

Again: Password must be URL encoded.
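Since the password has to be URL-encoded before it goes into the token request, a small helper can do the encoding in plain shell. This is a minimal sketch, not Wallabag's own tooling: the function name `urlencode`, the variable names and the host `wallabag.example.com` are assumptions for illustration, and the `/oauth/v2/token` path with `grant_type=password` comes from the Wallabag API documentation rather than from these notes.

```shell
# Percent-encode a string for use in a URL query.
# Letters, digits and - _ . ~ are kept; every other byte becomes %XX.
urlencode() (
  LC_ALL=C            # process the input byte by byte
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}       # everything after the first byte
    c=${s%"$rest"}    # the first byte itself
    case $c in
      [a-zA-Z0-9._~-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;
    esac
    s=$rest
  done
  printf '%s\n' "$out"
)

# Hypothetical usage (network call, so left commented out):
# curl -s "https://wallabag.example.com/oauth/v2/token?grant_type=password&client_id=$CLIENT_ID&client_secret=$CLIENT_SECRET&username=$WB_USER&password=$(urlencode "$WB_PASSWORD")"
```

For example, `urlencode 'p@ss w&rd'` prints `p%40ss%20w%26rd`.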

Then add a URL

The cURL request must be a POST, not a GET, so the following will NOT work:

curl "<access_token>&url=<url>"

This one works:

curl -d 'access_token=<access_token>&url=<url>'
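The two steps can be chained: pull the `access_token` field out of the token response, then send it in a POST body as in the working command above. This is a rough sketch under assumptions: the helper name `extract_access_token` is made up here, the sample JSON is an illustrative response shape with made-up values, and the host and the `/api/entries.json` path are assumptions based on the Wallabag API documentation, not on these notes.

```shell
# Read a JSON token response on stdin and print the access_token value.
# Uses sed only, so it assumes the token contains no escaped quotes.
extract_access_token() {
  sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Illustrative response (made-up values, not real output):
sample='{"access_token":"abc123","expires_in":3600,"token_type":"bearer"}'
token=$(printf '%s\n' "$sample" | extract_access_token)
printf 'token: %s\n' "$token"

# Hypothetical follow-up (network call, so left commented out):
# curl -s -d "access_token=$token" \
#   --data-urlencode "url=https://example.com/article" \
#   "https://wallabag.example.com/api/entries.json"
```

Because `-d` and `--data-urlencode` both imply a POST body, the request is never sent as a GET.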

Archiving URLs that were posted to Wallabag

Everything is bookmarked using Wallabag and then via RSS distributed to other archival engines.

  1. Save to Wallabag. Expose RSS feed of URLs
  2. Also save to Notion, separate database “The New Pocket”, just in case
  3. Also save in for the best

I’ve implemented most of the automation tasks using and

In the future, I could host my own ArchiveBox, as it is amazing. That would reduce this to a two-step process.
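The RSS-driven hand-off described above could look roughly like the following: read the feed, pull out each item link, and hand every link to an archiver. This is only a sketch under assumptions: it expects an RSS 2.0 feed with each `<link>` element on a single line, the function names and `FEED_URL` are invented for illustration, and the Wayback Machine's `https://web.archive.org/save/` endpoint stands in for whichever archival engines are actually used.

```shell
# Print the text of every <link>...</link> element found on stdin, one per line.
# Note: this also prints the channel's own <link>, if the feed has one.
feed_links() {
  sed -n 's:.*<link>\([^<]*\)</link>.*:\1:p'
}

# Build the save-request URL for one page (assumed endpoint).
wayback_save_url() {
  printf 'https://web.archive.org/save/%s\n' "$1"
}

# Hypothetical pipeline (network calls, so left commented out):
# curl -s "$FEED_URL" | feed_links | while IFS= read -r link; do
#   curl -s -o /dev/null "$(wayback_save_url "$link")"
#   sleep 5   # pause between submissions
# done
```

A real XML parser would be more robust than `sed`; the line-based approach is chosen here only to keep the sketch dependency-free.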

Posting URL to


For this service, it is not that easy:

archiving - How do I archive a webpage to using wget or curl? - Web Applications Stack Exchange

wabarc/ A command-line tool and Go package for wayback webpage to

jjjake/internetarchive: A Python and Command-Line Interface to


awesome-selfhosted/awesome-selfhosted: A list of Free Software network services and web applications which can be hosted on your own servers

Quite serious competitors:

Pinboard & Pocket Alternatives

Wallabag Setup on ISPConfig

Wallabag is an open-source, self-hostable Pocket alternative with full-text search, text and media archiving, and clients for all mobile platforms and for Kindle e-readers.

The internal option “Download images locally” must be enabled.

Nginx options in ISPConfig:

client_max_body_size 3000m;

##subroot web ##

location / {
    try_files $uri /app.php$is_args$args;
}

location ~ \.php$ {
    fastcgi_split_path_info ^(.+\.php)(/.*)$;
    include /etc/nginx/fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_param DOCUMENT_ROOT $document_root;
    # a fastcgi_pass directive pointing at PHP-FPM is also required (omitted here)
}


Installation on shared hosting

The static package requires --env=prod to be appended to each command, as the static package is only usable in a prod environment. Create your first user with the command php bin/console wallabag:install --env=prod. If an error occurs at this step due to bad settings, clear the cache with php bin/console cache:clear --env=prod before retrying the previous command.

The default configuration uses MySQL for the database, and the database settings are in app/config/parameters.yml. Passwords must be surrounded by single quotes ('). The same config file holds other configuration options such as domain_name.


Make some files executable:

cd web/bin
chmod g+x *

Trigger import:

bin/console wallabag:import:redis-worker --env=prod pinboard -vv >> import-pinboard.log

New entrant in 2023:

Briefkasten Bookmarks: Briefkasten is a very attractive, well-coded open-source project. Its only limitation is that the “full-text” search covers only the descriptions of the links, not the content of the linked pages.

Github: ndom91/briefkasten

How to save a web page on the Internet Archive? | LearnTips

Web Archiving Services | AlternativeTo

List of all archivers:

Archive Aggregators: a service that searches all archives and tells you which ones have the page archived and which do not

Alternatives:

  • Ghostarchive, a website archive - perfect; uses tooling from Webrecorder Tools
  • FreezePage - odd
  • Library of Congress Web Archives - cannot be used to archive pages?
  • Stanford Web Archive Portal - cannot be used to archive pages?

Wayback Machine Wayback Machine (All) Google Cache Google Cache (Text-Only) Bing Cache Yandex Cache (All)

Gigablast Cache Yahoo Japan Cache Megalodon Baidu Snapshot Yahoo Cache Qihoo 360 Search Snapshot Cache

Presumably like this? But it cannot be done later.

? Archive.St - Free web page archiving service

Should I archive pages myself with wget?
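One possible answer is a single-page snapshot using wget's page-requisite options. The sketch below is an assumption about how that could be wired up, not a tested archiving setup: the function name `archive_page`, the default output directory `archive` and the `WGET` override variable (handy for a dry run with `echo`) are all made up for illustration; the wget flags themselves are standard GNU Wget options.

```shell
# Save one page plus the images, CSS and scripts it needs for offline viewing.
# Usage: archive_page <url> [output-dir]
# Set WGET=echo to print the arguments instead of downloading (dry run).
archive_page() (
  url=$1
  dest=${2:-archive}
  "${WGET:-wget}" \
    --page-requisites \
    --convert-links \
    --adjust-extension \
    --span-hosts \
    --no-parent \
    --directory-prefix="$dest" \
    "$url"
)

# Dry run: shows what would be passed to wget.
WGET=echo archive_page "https://example.com/post" out
```

Replacing `--page-requisites` with `--mirror` would crawl a whole site rather than one page, at the cost of a much larger download.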

Tutorial: Back Up a Web Page or Web Site – Data Horde

Common Crawl

jsvine/waybackpack: Download the entire Wayback Machine archive for a given URL.

motherboardgithub/mass_archive: A basic Python tool for pushing a web page to multiple archiving services at once

oduwsdl/archivenow: A Tool To Push Web Resources Into Web Archives

There is also a very interesting tool that checks how suitable a website is for archiving, known as the “Website Archivability Evaluation Tool”.

Browser Extensions

Archive Site Extensions
Retrieve Archive Extensions

Adds a button to the Mozilla Firefox toolbar. When clicked, it sends the URL of the current tab to the archiving service to preserve a snapshot of the page, and opens the result in a new tab.


Stephen Ostermiller’s Cache Bookmarklets

JS Small: Stephen Ostermiller’s Cache Bookmarklets

Android add-ons

  • Share2Archive will open web page in Archive.Today using the default browser

Update Wallabag

make update only runs fine if you installed wallabag using git. If you installed it using the shared hosting way (the tar archive) you have to follow the manual process:

runuser www-data -s /bin/bash

# standard permissions
chmod 755 ./{web,var,bin,vendor,app/config} -R

# more permissions
chmod -R 775 var/{logs,cache}/

bin/console cache:clear --env=prod

# chown -R www-data:www-data /srv/wallabag/{web,var,bin,vendor,app/config}
cat /etc/group | grep www-data

# add web27 to group www-data
usermod -a -G www-data web27

# log
tail var/logs/prod.log -f -n0


Omnivore is finally the perfect Wallabag alternative: it works with Puppeteer and does exactly everything I want. It is self-hosted, with the repo at omnivore-app/omnivore: Omnivore is a complete, open source read-it-later solution for people who like reading. For now, though, you can use their hosted version for free. The documentation is at Docs. Note that Omnivore will perform full text search across library item’s content, title, description, and site by default - read here: Search | Omnivore Docs

date 09. Nov 2022 | modified 10. Jun 2024
filename: Research » Bookmarking and Archiving Content