sfeed

Simple RSS and Atom feed parser
git clone https://git.sinitax.com/codemadness/sfeed

commit 645ef7420056796e6d2716bf920b8704451912ac
parent 2f8a83288d91ea0abc2e4ebd6754513ee3ad37ec
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date:   Wed, 27 Jan 2021 13:07:45 +0100

typofixes

Diffstat:
 M README      | 20 ++++++++++----------
 M sfeed.1     |  8 ++++----
 M sfeed.5     |  2 +-
 M sfeed_web.1 | 16 ++++++++--------
 M sfeedrc.5   | 10 +++++-----
 M util.c      |  2 +-
 M xml.c       |  2 +-
7 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/README b/README
@@ -143,7 +143,7 @@ sfeed_mbox - Format feed data (TSV) to mbox.
 sfeed_plain - Format feed data (TSV) to a plain-text list.
 sfeed_twtxt - Format feed data (TSV) to a twtxt feed.
 sfeed_update - Update feeds and merge items.
-sfeed_web - Find urls to RSS/Atom feed from a webpage.
+sfeed_web - Find URLs to RSS/Atom feed from a webpage.
 sfeed_xmlenc - Detect character-set encoding from a XML stream.
 sfeedrc.example - Example config file. Can be copied to $HOME/.sfeed/sfeedrc.
 style.css - Example stylesheet to use with sfeed_html(1) and
@@ -156,7 +156,7 @@ Files read at runtime by sfeed_update(1)
 sfeedrc - Config file. This file is evaluated as a shellscript in
 sfeed_update(1).
 
-Atleast the following functions can be overridden per feed:
+At least the following functions can be overridden per feed:
 
 - fetch: to use wget(1), OpenBSD ftp(1) or an other download program.
 - filter: to filter on fields.
@@ -190,7 +190,7 @@ man 1 sfeed
 
 Usage and examples
 ------------------
 
-Find RSS/Atom feed urls from a webpage:
+Find RSS/Atom feed URLs from a webpage:
 
 	url="https://codemadness.org"; curl -L -s "$url" | sfeed_web "$url"
@@ -226,7 +226,7 @@ View formatted output in your editor:
 - - -
 
 Example script to view feed items in a vertical list/menu in dmenu(1). It opens
-the selected url in the browser set in $BROWSER:
+the selected URL in the browser set in $BROWSER:
 
 	#!/bin/sh
 	url=$(sfeed_plain "$HOME/.sfeed/feeds/"* | dmenu -l 35 -i | \
@@ -252,7 +252,7 @@ argument is optional):
 - - -
 
 The filter function can be overridden in your sfeedrc file. This allows
-filtering items per feed. It can be used to shorten urls, filter away
+filtering items per feed. It can be used to shorten URLs, filter away
 advertisements, strip tracking parameters and more.
 
 	# filter fields.
@@ -367,7 +367,7 @@ cut -b is used to trim the "N " prefix of sfeed_plain(1).
 - - -
 
 For some podcast feed the following code can be used to filter the latest
-enclosure url (probably some audio file):
+enclosure URL (probably some audio file):
 
 	awk -F '\t' 'BEGIN { latest = 0; }
 	length($8) {
@@ -597,7 +597,7 @@ generated ETag to pin and fingerprint a client.
 
 - - -
 
-CDN's blocking requests due to a missing HTTP User-Agent request header
+CDNs blocking requests due to a missing HTTP User-Agent request header
 
 sfeed_update will not send the "User-Agent" header by default for privacy
 reasons. Some CDNs like Cloudflare don't like this and will block such HTTP
@@ -619,7 +619,7 @@ are treated as an error.
 For example to prevent hijacking an unencrypted http:// to https:// redirect
 or to not add time of an unnecessary page redirect each time.
 It is encouraged to
-use the final redirected url in the sfeedrc config file.
+use the final redirected URL in the sfeedrc config file.
 
 If you want to ignore this advise you can override the fetch() function in the
 sfeedrc file and change the curl options "-L --max-redirs 0".
@@ -675,7 +675,7 @@ TSV format.
 	#!/bin/sh
 	# Export newsbeuter/newsboat cached items from sqlite3 to the sfeed TSV format.
 	# The data is split per file per feed with the name of the newsboat title/url.
-	# It writes the urls of the read items line by line to a "urls" file.
+	# It writes the URLs of the read items line by line to a "urls" file.
 	#
 	# Dependencies: sqlite3, awk.
 	#
@@ -745,7 +745,7 @@ TSV format.
 		"html" "\t" field($5) "\t" field($6) "\t" field($7) \
 		> fname;
 
-	# write urls of the read items to a file line by line.
+	# write URLs of the read items to a file line by line.
 	if ($10 == "0") {
 		print $3 > "urls";
 	}
diff --git a/sfeed.1 b/sfeed.1
@@ -1,4 +1,4 @@
-.Dd January 1, 2021
+.Dd January 26, 2021
 .Dt SFEED 1
 .Os
 .Sh NAME
@@ -13,8 +13,8 @@ reads RSS or Atom feed data (XML) from stdin.
 It writes the feed data in a TAB-separated format to stdout.
 A
 .Ar baseurl
-can be specified if the links in the feed are relative urls.
-It is recommended to always have absolute urls in your feeds.
+can be specified if the links in the feed are relative URLs.
+It is recommended to always have absolute URLs in your feeds.
 .Sh TAB-SEPARATED FORMAT FIELDS
 The items are output per line in a TSV-like format.
 .Pp
@@ -35,7 +35,7 @@ UNIX timestamp in UTC+0, empty if missing or on parse failure.
 .It title
 Title text, HTML code in titles is ignored and is treated as plain-text.
 .It link
-Absolute url, unsafe characters are encoded.
+Absolute URL, unsafe characters are encoded.
 .It content
 Content, can have plain-text or HTML code depending on the content-type field.
 .It content-type
diff --git a/sfeed.5 b/sfeed.5
@@ -29,7 +29,7 @@ UNIX timestamp in UTC+0, empty if missing or on parse failure.
 .It title
 Title text, HTML code in titles is ignored and is treated as plain-text.
 .It link
-Absolute url, unsafe characters are encoded.
+Absolute URL, unsafe characters are encoded.
 .It content
 Content, can have plain-text or HTML code depending on the content-type field.
 .It content-type
diff --git a/sfeed_web.1 b/sfeed_web.1
@@ -1,31 +1,31 @@
-.Dd January 1, 2021
+.Dd January 26, 2021
 .Dt SFEED_WEB 1
 .Os
 .Sh NAME
 .Nm sfeed_web
-.Nd finds urls to feeds from a HTML webpage
+.Nd finds URLs to feeds from a HTML webpage
 .Sh SYNOPSIS
 .Nm
 .Op Ar baseurl
 .Sh DESCRIPTION
 .Nm
 reads the HTML website as XML or HTML data from stdin and writes the found
-urls to stdout.
+URLs to stdout.
 .Sh OPTIONS
 .Bl -tag -width 8n
 .It Ar baseurl
-Optional base url to use for found feed urls that are relative.
+Optional base URL to use for found feed URLs that are relative.
 .El
 .Sh OUTPUT FORMAT
 url<TAB>content\-type<newline>
 .Bl -tag -width Ds
 .It url
-Found relative or absolute url.
+Found relative or absolute URL.
 .Pp
-For relative urls if a <base href="..." /> tag is found it will be used,
+For relative URLs if a <base href="..." /> tag is found it will be used,
 otherwise if the
 .Ar baseurl
-option is specified then that is used, if neither are set then the relative url
+option is specified then that is used, if neither are set then the relative URL
 is printed.
 .It content\-type
 Usually application/atom+xml or application/rss+xml.
@@ -33,7 +33,7 @@ Usually application/atom+xml or application/rss+xml.
 .Sh EXIT STATUS
 .Ex -std
 .Sh EXAMPLES
-Get urls from xkcd website:
+Get URLs from xkcd website:
 .Bd -literal
 curl -s -L 'http://www.xkcd.com/' | sfeed_web 'http://www.xkcd.com/'
 .Ed
diff --git a/sfeedrc.5 b/sfeedrc.5
@@ -1,4 +1,4 @@
-.Dd January 24, 2021
+.Dd January 26, 2021
 .Dt SFEEDRC 5
 .Os
 .Sh NAME
@@ -37,13 +37,13 @@ Name of the feed, this is also used as the filename for the TAB-separated feed
 file.
 The feedname cannot contain '/' characters, they will be replaced with '_'.
 .It Fa feedurl
-Url to fetch the RSS/Atom data from, usually a HTTP or HTTPS url.
+URL to fetch the RSS/Atom data from, usually a HTTP or HTTPS URL.
 .It Op Fa basesiteurl
 Baseurl of the feed links.
 This argument allows to fix relative item links.
 .Pp
 According to the RSS and Atom specification feeds should always have absolute
-urls, however this is not always the case in practise.
+URLs, however this is not always the case in practise.
 .It Op Fa encoding
 Feeds are decoded from this name to UTF-8, the name should be a usable
 character-set for the
@@ -58,7 +58,7 @@ is a shellscript each function can be overridden to change its behaviour,
 notable functions are:
 .Bl -tag -width Ds
 .It Fn fetch "name" "url" "feedfile"
-Fetch feed from url and writes data to stdout, its arguments are:
+Fetch feed from URL and writes data to stdout, its arguments are:
 .Bl -tag -width Ds
 .It Fa name
 Specified name in configuration file (useful for logging).
@@ -94,7 +94,7 @@ TSV format.
 .It Fa name
 Name of the feed.
 .It Fa feedurl
-Url of the feed.
+URL of the feed.
 .It Fa basesiteurl
 Baseurl of the feed links.
 This argument allows to fix relative item links.
diff --git a/util.c b/util.c
@@ -204,7 +204,7 @@ strtotime(const char *s, time_t *t)
 	if (errno || *s == '\0' || *e)
 		return -1;
 	/* NOTE: assumes time_t is 64-bit on 64-bit platforms:
-	   long long (atleast 32-bit) to time_t. */
+	   long long (at least 32-bit) to time_t. */
 	if (t)
 		*t = (time_t)l;
diff --git a/xml.c b/xml.c
@@ -292,7 +292,7 @@ xml_parse(XMLParser *x)
 		if (c == '!') { /* cdata and comments */
 			for (tagdatalen = 0; (c = GETNEXT()) != EOF;) {
-				/* NOTE: sizeof(x->data) must be atleast sizeof("[CDATA[") */
+				/* NOTE: sizeof(x->data) must be at least sizeof("[CDATA[") */
 				if (tagdatalen <= sizeof("[CDATA[") - 1)
 					x->data[tagdatalen++] = c;
 				if (c == '>')
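One of the README hunks in this commit touches an awk snippet that picks the latest podcast enclosure URL (TSV field 8, keyed by the UNIX timestamp in field 1) from sfeed's output. A minimal, self-contained sketch of that idea follows; the two sample items (the example.org URLs and timestamps) are made up for illustration and do not come from the diff:

```shell
# Pick the enclosure URL of the newest item in sfeed's TSV format.
# Fields: 1=timestamp 2=title 3=link 4=content 5=content-type 6=id
#         7=author 8=enclosure. Sample data below is hypothetical.
latest_url=$(printf '1609459200\tEpisode 1\thttps://example.org/1\t\t\tep1\t\thttps://example.org/ep1.mp3\n1612137600\tEpisode 2\thttps://example.org/2\t\t\tep2\t\thttps://example.org/ep2.mp3\n' |
awk -F '\t' 'BEGIN { latest = 0; }
length($8) {
	ts = int($1);
	if (ts > latest) {
		latest = ts;
		url = $8;
	}
}
END { if (length(url)) { print url; } }')
echo "$latest_url"
```

In practice the input would come from a feed file such as "$HOME/.sfeed/feeds/podcastname" instead of the inline printf.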