Research » Bookmarking and Archiving Content


Best Free Site Archiving Services

archive.today

Blazingly fast and Memento API compatible.

Link to the most recent URL archive: http://archive.is/http://en.wikipedia.org/

Internet Archive aka Wayback Machine

Latest URL archive

https://web.archive.org/web/wikipedia.org

Arquivo.pt

Resources:

Web archives that support the Memento protocol natively: Memento Depot
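The "latest archive" links for archive.today and the Wayback Machine above share one pattern: a service prefix followed by the target URL. A minimal Python sketch of that pattern follows; the helper name and the prefix table are hypothetical, and only the two prefixes are taken from the links on this page.

```python
# Prefixes copied from the example links in this section.
# The helper and the table name are illustrative, not part of any service's API.
LATEST_PREFIXES = {
    "archive.today": "http://archive.is/",
    "wayback": "https://web.archive.org/web/",
}


def latest_archive_url(service: str, target: str) -> str:
    """Return the link pattern used above: service prefix + target URL."""
    return LATEST_PREFIXES[service] + target


if __name__ == "__main__":
    print(latest_archive_url("archive.today", "http://en.wikipedia.org/"))
    print(latest_archive_url("wayback", "wikipedia.org"))
```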


Pinboard & Pocket Alternatives

Raindrop.io - Keep your favorites handy
Dropmark | Organize, collaborate, and share online
REVISIT.IO - Better bookmarking
Bookmarks.io - Better Bookmarking


Archive Wallabag Links Forever

Adding a URL to Wallabag

This is more complicated than a simple cURL call, because the API uses OAuth:

https://doc.wallabag.org/en/developer/api/oauth.html

Obtain an API token

The API token expires. It is obtained from the client ID, client secret, username and user password.

Create a client_id and client_secret to access Wallabag via the API, then run curl (the password must be URL-encoded):

curl -s "https://pocket.cvladan.com/oauth/v2/token?grant_type=password&client_id=<the_client_id>&client_secret=<the_client_secret>&username=<username>&password=<urlencoded_password>"
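Because the password must be URL-encoded, it can be easier to let a script do the encoding. The following is a minimal Python sketch, not a tested Wallabag client: the function names and the `wallabag.example` host are hypothetical, and it assumes the token endpoint returns JSON containing an `access_token` field.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_token_url(base_url, client_id, client_secret, username, password):
    """Build the /oauth/v2/token URL; urlencode() percent-encodes every value."""
    query = urlencode({
        "grant_type": "password",
        "client_id": client_id,
        "client_secret": client_secret,
        "username": username,
        "password": password,  # pass the raw password; it is encoded here
    })
    return f"{base_url.rstrip('/')}/oauth/v2/token?{query}"


def fetch_access_token(token_url):
    """Send the request; assumes the JSON response has an 'access_token' field."""
    with urlopen(token_url) as response:
        return json.load(response)["access_token"]


if __name__ == "__main__":
    # Placeholder values only; building the URL sends nothing over the network.
    print(build_token_url("https://wallabag.example", "my_client_id",
                          "my_client_secret", "my_user", "pass+word with space"))
```

Note that `urlencode()` encodes a space as `+` and a literal `+` as `%2B`, which is why a password containing `+` breaks when pasted into the URL unencoded.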

Then add a URL

The cURL request must be a POST, not a GET, so the following does NOT work:

curl "https://pocket.cvladan.com/api/entries.json?access_token=<access_token>&url=<url>"

This one works:

curl -d 'access_token=<access_token>&url=<url>' https://pocket.cvladan.com/api/entries.json
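The same POST can be sketched in Python with the standard library. This is an illustration under assumptions: the function names and the `wallabag.example` host are hypothetical, and the form fields simply mirror the working `curl -d` call above.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_add_entry_request(base_url, access_token, url):
    """Build a POST request to /api/entries.json with form-encoded fields."""
    body = urlencode({"access_token": access_token, "url": url}).encode("ascii")
    # Supplying data= makes urllib issue a POST, like `curl -d`.
    return Request(f"{base_url.rstrip('/')}/api/entries.json", data=body)


def add_entry(base_url, access_token, url):
    """Send the request and return the HTTP status code."""
    with urlopen(build_add_entry_request(base_url, access_token, url)) as response:
        return response.status


if __name__ == "__main__":
    # Only builds the request; call add_entry() to actually send it.
    request = build_add_entry_request("https://wallabag.example", "my_token",
                                      "https://example.com/")
    print(request.get_method(), request.full_url)
```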

Archiving URLs that were posted to Wallabag

Everything is bookmarked in Wallabag and then distributed via RSS to other archival engines.

  1. Save to Wallabag. Expose an RSS feed of URLs
  2. Also save to Notion, in a separate database “The New Pocket”, just in case
  3. Also save to Archive.org, for good measure

I’ve implemented most of the automation tasks using pipedream.com and ifttt.com.

In the future, I could host my own ArchiveBox, as it is amazing. That would make it only a 2-step process.

Archive.org

Posting a URL to Archive.org:

curl https://web.archive.org/save/https://www.cnc24.com/
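The RSS-to-Archive.org step can be sketched as follows. This is not the actual pipedream.com or ifttt.com automation. It assumes the feed is plain RSS 2.0 with one `<link>` per `<item>`; the function name and the sample feed are hypothetical. Only the `https://web.archive.org/save/` prefix comes from the curl call above.

```python
import xml.etree.ElementTree as ET

# Same endpoint prefix as the curl call above.
SAVE_PREFIX = "https://web.archive.org/save/"


def save_urls_from_rss(rss_text):
    """Return one Wayback 'save' URL per <item><link> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    urls = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link and link.strip():
            urls.append(SAVE_PREFIX + link.strip())
    return urls


# Made-up feed for illustration; a real feed would be fetched from Wallabag.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Unread</title>
<item><title>One</title><link>https://example.com/a</link></item>
<item><title>Two</title><link> https://example.org/b </link></item>
</channel></rss>"""

if __name__ == "__main__":
    for save_url in save_urls_from_rss(SAMPLE_FEED):
        print(save_url)  # each of these could then be requested, as with curl
```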

Wallabag

awesome-selfhosted/awesome-selfhosted: A list of Free Software network services and web applications which can be hosted on your own servers

Quite serious competitors:

Wallabag Setup on ISPConfig

Wallabag is an open-source, self-hostable Pocket alternative with full-text search, text and media archiving, and a lot of clients for all the mobile platforms and for Kindle e-readers.

The internal option “Download images locally” must be enabled.

Nginx options in ISPConfig:

client_max_body_size 3000m;

##subroot web ##

location / {
    try_files $uri /app.php$is_args$args;
}

location ~ \.php$ {
  fastcgi_split_path_info ^(.+\.php)(/.*)$;
  {FASTCGIPASS}
  include /etc/nginx/fastcgi_params;
  fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
  fastcgi_param DOCUMENT_ROOT $document_root;
  internal;
}

Symfony

https://symfony.com/doc/current/setup/web_server_configuration.html#nginx


Installation on shared hosting

The static package requires --env=prod to be appended to each command, as the static package is only usable in a prod environment. You must create your first user with the command php bin/console wallabag:install --env=prod. If an error occurs at this step due to bad settings, clear the cache with php bin/console cache:clear --env=prod before retrying the previous command.

The default configuration uses MySQL for the database, and the database settings are in app/config/parameters.yml. Passwords must be surrounded by single quotes ('). The same config file is used for other configuration options, such as domain_name.
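A password that itself contains a single quote needs that quote doubled inside a YAML single-quoted string. A small Python sketch of the quoting rule follows; the helper name is hypothetical, and `database_password` is assumed to be the relevant key in parameters.yml.

```python
def yaml_single_quoted(value: str) -> str:
    """Wrap a value in YAML single quotes, doubling any embedded single quote."""
    return "'" + value.replace("'", "''") + "'"


if __name__ == "__main__":
    # 'database_password' is an assumed key name; check your own parameters.yml.
    print("database_password: " + yaml_single_quoted("pa'ss#word"))
```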


Importing

https://doc.wallabag.org/en/admin/asynchronous.html#launch-redis-consumer

Make some files executable:

cd web/bin
chmod g+x *

Trigger the import:

bin/console wallabag:import:redis-worker --env=prod pinboard -vv >> import-pinboard.log

date 01. Jan 0001 | modified 10. Nov 2022
filename: Research » Bookmaking and Archiving Content