Developer forum posts as RSS
(55 posts, started )
Although there are plenty of interesting discussions on this forum, my main reason for coming here is to see what Scawen is up to. When the new forum was initially released I spent a couple of minutes on a userscript to highlight the developers' posts, but I figured I could do a little better.

So I wrote a little PHP scraper that visits the profiles of the devs, collects their most recent posts and then publishes the data in an RSS feed.

The feed is available here: https://www.liveforspeed.se/lfsdev-rss/

... And if you'd like to extend this, here's the scraper:


<?php

/**
 * Return the most recent posts by a given user at lfsforum.net.
 * Breakage-prone due to DOM-parsing! :)
 *
 * @author felplacerad
 * @version 0.1
 */
class LfsForumScraper
{
    public $posts = []; // Accumulates across scrapeAuthor() calls
    protected $dom;
    protected $cache;

    /**
     * Refuse to run if the cache file was modified less than 10 minutes ago.
     * Otherwise create a new DOMDocument object and
     * read the cache file into an array.
     */
    public function __construct()
    {
        if (is_file('./scraper.cache') && time() - filemtime('./scraper.cache') < 600) die("No hammering!\n");

        $this->dom = new DOMDocument();
        libxml_use_internal_errors(true); // Suppress libxml warnings about malformed HTML
        $this->cache = (file_exists('./scraper.cache') ? file('./scraper.cache', FILE_IGNORE_NEW_LINES) : []);
    }

    /**
     * Load HTML, evaluate XPath expressions and sanitize the input a bit
     * (i.e. remove element attributes and most tags).
     * Store seen post ids in the cache file.
     * Return posts that weren't already seen.
     */
    public function scrapeAuthor($targetAuthor = 'Scawen')
    {
        $url = "https://www.lfs.net/forum/-1/search/user:'$targetAuthor'";
        $opts = array('http'=>array('header'=>"User-Agent: fel-notify/0.1"));
        $context = stream_context_create($opts);

        $html = file_get_contents($url, false, $context);
        if ($html === false || $html === '') return $this->posts; // Request failed, nothing new to add

        $this->dom->loadHTML($html);
        $xpath = new DOMXPath($this->dom);
        $ids = []; // Post ids seen for the first time during this call

        // Example: <div class="FPost">
        $tags = $xpath->query('//div[@class="FPost"]');

        foreach ($tags as $tag) {
            $id = $xpath->query('./div[contains(@id, "Post")]', $tag)->
                item(0)->getAttribute('id');

            if (!in_array($id, $this->cache)) {
                $topic = $xpath->query('./div/a', $tag)->
                    item(0)->nodeValue;

                $tlink = $xpath->query('./div/a', $tag)->
                    item(0)->getAttribute('href');

                $plink = $xpath->query('./div[@class="FPostHeader"]/div/a', $tag)->
                    item(0)->getAttribute('href');

                $text = $this->dom->saveXML($xpath->query('./div/div/div[@class="FPostText"]/node()', $tag)->
                    item(0)->parentNode);

                $author = $xpath->query('./div/div[@class="FUserInfo"]/a[@class="UserLink"]', $tag)->
                    item(0)->nodeValue;

                $alink = $xpath->query('./div/div[@class="FUserInfo"]/a[@class="UserLink"]', $tag)->
                    item(0)->getAttribute('href');

                $datetime = $xpath->query('./div[@class="FPostHeader"]/div/time', $tag)->
                    item(0)->getAttribute('datetime');

                if ($author === $targetAuthor) { // LFS Forum may yield false results due to wildcard matches
                    $this->posts[$id]['id'] = $id;
                    $this->posts[$id]['datetime'] = date(DATE_RFC2822, (strtotime($datetime)));
                    $this->posts[$id]['author'] = $author;
                    $this->posts[$id]['topic'] = htmlspecialchars($topic);
                    $this->posts[$id]['alink'] = $alink;
                    $this->posts[$id]['tlink'] = $tlink;
                    $this->posts[$id]['plink'] = $plink;
                    // Keep a handful of tags and strip all attributes from them
                    $this->posts[$id]['text'] = preg_replace("/<([a-z][a-z0-9]*)[^>]*?(\/?)>/i",'<$1$2>',
                        strip_tags($text, '<div><p><a><fieldset><legend>'));
                }

                $ids[] = $id;
            }
        }

        if (count($ids) > 0) {
            file_put_contents('./scraper.cache', "\n" . implode("\n", $ids), FILE_APPEND);
        }

        return $this->posts;
    }

}

$scraper = new LfsForumScraper;

// Each call returns everything collected so far, so the last $posts holds all three devs
$posts = $scraper->scrapeAuthor('Scawen');
$posts = $scraper->scrapeAuthor('Victor');
$posts = $scraper->scrapeAuthor('Eric');

print_r($posts);

(Vic is OK with this, I checked ...)
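
The scraper above only returns an array of posts; the code that publishes the feed is not shown in this thread. A minimal sketch of how such an array might be turned into an RSS 2.0 document follows. The function name buildFeed(), the channel values passed to it and the assumption that 'plink' is an absolute URL are all illustrative, not the actual feed code.

```php
<?php

/**
 * Build an RSS 2.0 document from an array shaped like LfsForumScraper::$posts.
 * buildFeed() is a hypothetical helper, not part of the scraper.
 */
function buildFeed(array $posts, string $title, string $link): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->formatOutput = true;

    $rss = $doc->createElement('rss');
    $rss->setAttribute('version', '2.0');
    $doc->appendChild($rss);

    $channel = $doc->createElement('channel');
    $rss->appendChild($channel);

    // Text nodes are escaped on output, so &, < and > in values are safe
    $add = function (DOMElement $parent, string $name, string $value) use ($doc) {
        $el = $doc->createElement($name);
        $el->appendChild($doc->createTextNode($value));
        $parent->appendChild($el);
    };

    $add($channel, 'title', $title);
    $add($channel, 'link', $link);
    $add($channel, 'description', $title);

    foreach ($posts as $post) {
        $item = $doc->createElement('item');
        // The scraper already ran htmlspecialchars() on the topic, so decode it first
        $topic = htmlspecialchars_decode($post['topic']);
        $add($item, 'title', $post['author'] . ' posted in ' . $topic);
        $add($item, 'link', $post['plink']);   // Assumed to be an absolute URL
        $add($item, 'guid', $post['plink']);
        $add($item, 'pubDate', $post['datetime']);
        $add($item, 'description', $post['text']);
        $channel->appendChild($item);
    }

    return $doc->saveXML();
}
```

Calling buildFeed($scraper->posts, 'lfsdev-rss', 'https://www.liveforspeed.se/lfsdev-rss/') after the three scrapeAuthor() calls would return the XML as a string. Going through DOMDocument rather than string concatenation leaves the escaping to the library.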
Nice thanks!
And oh: If there's interest I could pull the LFS Racing tweets into the feed as well.

Oops, noticed that the sorting order was mixed up. Give me a second to fix this.
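
The actual fix is not shown in the thread. One plausible cause is that scrapeAuthor() is called once per developer, so the collected posts end up grouped by author rather than by date. Under that assumption, a sort on the 'datetime' field could look like the sketch below; sortNewestFirst() is a hypothetical helper.

```php
<?php

/**
 * Sort posts newest-first by their 'datetime' field.
 * sortNewestFirst() is a hypothetical helper, not part of the scraper.
 */
function sortNewestFirst(array $posts): array
{
    // uasort() keeps the post-id keys; swapping $a and $b gives descending order
    uasort($posts, function (array $a, array $b): int {
        return strtotime($b['datetime']) <=> strtotime($a['datetime']);
    });

    return $posts;
}
```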
#4 - troy
Neat, subbed!
I was just reading stuff until now
Really nice job Sir! Subbed & enjoying!
#6 - herki
Neat. Now I can be inactive on the forums and still see if there's a new patch out.
Added LFSRacing tweets to the RSS as well (but no retweets, mentions or stuff like that).
#8 - troy
Any chance we could be sent directly to the post in question if we follow the RSS feed? It now seems to link to the beginning of the thread for me.
Already happens when I click the link.
#10 - troy
Mh, does not happen for me, I just get to the start of a thread.

Edit: Looks like a problem with the RSS addon I use for Firefox. Never mind.
What addon? I could have a look to see if it's fixable.
#12 - troy
It's that one: https://addons.mozilla.org/en-US/firefox/addon/rss-ticker/

Looks like it cuts it at the "#"

"http://nyc.rly.nu/lfsdev-rss/":{"uri":"http://nyc.rly.nu/lfsdev-rss/","siteUri":"http://nyc.rly.nu/","label":"lfsdev-rss","image":"http://nyc.rly.nu/favicon.ico","items":[{"url":"https://www.lfs.net/forum/post/1872107","label":"Scawen posted in Last call for translations - 0.6G full version tomorrow!","description":"I've made it drop two lines if it's going to draw the OK button on that Health and Safety Warning. I guess it's only English that is compact enough to fit in that space, so it's better this way. Those ones on the side in that screen are not translatable, could have been but I won't add them now.\r\n",

Yeah, no. I can't fix that. I figured perhaps I had forgotten to wrap htmlentities/htmlspecialchars around the link, but it's fine. I'm quite sure # is a valid URL character anyway ... To be sure I tried putting &#35; in there instead, but nope.

Actually, the feed even validates (yay!).

So I'm out of ideas. Probably a bug in the ticker. :\
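
For anyone wanting to check the same thing, here is a small sketch of the two points above: htmlspecialchars() leaves "#" alone, and the part after "#" is the URL fragment. The fragment in the sample link is made up, since the real one is cut off in the ticker's data.

```php
<?php

// Hypothetical post link; the real fragment is not shown in the thread.
$link = 'https://www.lfs.net/forum/post/1872107#post1872107';

// htmlspecialchars() only rewrites &, <, > and quotes, so "#" passes through untouched.
$escaped = htmlspecialchars($link);

// parse_url() reports the part after "#" separately as the fragment.
$fragment = parse_url($link, PHP_URL_FRAGMENT);

// What a reader that cuts the URL at "#" ends up with.
$truncated = strstr($link, '#', true);

var_dump($escaped === $link);
echo $fragment, "\n", $truncated, "\n";
```

The truncated value matches the "url" the ticker stored, which is consistent with the ticker dropping the fragment rather than the feed omitting it.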
#14 - troy
I'll post on that dude's website, thanks for looking into it.
The site seems to be down for me.
Quote from Flame CZE: The site seems to be down for me.

Ah, yes -- It is down. I got into an argument with Digital Ocean in the beginning of April and the site has been down since then. I didn't think anyone was using the feed so I did not bother bringing it up again.

But since you reacted ... I should still have the code and the database so I'll bring it up again (somewhere else) when I return from my holiday trip! :)

Any updates on this?
Yes, we definitely need this, otherwise it's difficult to keep track of what's going on :P ;)
Sorry guys, I forgot about this.

New URI is https://www.liveforspeed.se/lfsdev-rss/ (old URI will begin to redirect in a few minutes).

Edit: Also, the Twitter integration didn't seem to work anymore so I disabled it for now (it was mostly just excerpts of what was posted on the forum/website anyway so I think we'll live).
Thanks (thumbs up)
Ah, yes! Heh, I saw that post, too.

The database had grown quite large over the years (2000+ entries) so I figured feed readers struggled a bit when parsing and formatting that many entries. So I added a default limit to show only the last 50 entries. I also added "pagination" support while I was at it.

So, if you like, you may override this default by adding limit=n and/or offset=n parameters to the request (e.g. https://www.liveforspeed.se/lfsdev-rss/?limit=100&offset=0).
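
The feed's own code is not posted here, so the following is only a sketch of how limit and offset parameters with a default of 50 might be handled. The function name paginate(), the validation rules and the assumption that $entries is already sorted newest-first are mine; only the parameter names and the default come from the post above.

```php
<?php

/**
 * Read limit/offset from a query array (e.g. $_GET) and slice the entries.
 * paginate() is a hypothetical helper; invalid or missing values fall back
 * to the defaults instead of producing an error.
 */
function paginate(array $entries, array $query, int $defaultLimit = 50): array
{
    // filter_var() returns an int on success and false for anything else
    $limit  = filter_var($query['limit'] ?? null, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
    $offset = filter_var($query['offset'] ?? null, FILTER_VALIDATE_INT, ['options' => ['min_range' => 0]]);

    if ($limit === false) $limit = $defaultLimit;
    if ($offset === false) $offset = 0;

    return array_slice($entries, $offset, $limit);
}
```

With this, paginate($entries, $_GET) would serve ?limit=100&offset=0 as the first 100 entries and a bare request as the first 50.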
Thank you very much :)
The site is down temporarily. It'll be back up next week. :)

Edit: Should be up again. Sorry about the hiccup. :)
Gah, the SSL/TLS certificate expired almost a month ago and I forgot to renew it. All fixed now. Sorry about this.

Edit: Crontab added. We'll see in about three months if it worked (note to self: Thursday, December 19, 2019). :P
