As an example, I will show you how to extract the latest news from http://www.skali.net/web/guest/news and put its headlines into a list.
[Image: Skali.Net News Page]
First, we need some very basic PHP code to obtain a copy of the Skali news page. For this basic lesson I will use this code:
<?php $data = file_get_contents(
'http://www.skali.net/web/guest/news'
); ?>
file_get_contents is a quick way to fetch a remote page as a string. The $data variable now contains the HTML source code of the Skali news page.
Next, we need to parse out its headlines. For this basic lesson I will use the regex method. Before writing the regex, we need to know what pattern the news headlines follow and what makes that pattern unique compared to the rest of the HTML code.
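One thing worth knowing: file_get_contents returns false when the read fails, and fetching a URL only works when allow_url_fopen is enabled in php.ini. Here is a minimal sketch of a safer fetch. The helper name fetchPage is my own invention for illustration, and a local temp file stands in for the Skali page so the sketch runs without a network connection.

```php
<?php
// fetchPage() is a hypothetical helper name, not part of the original code.
// It wraps file_get_contents() and fails loudly instead of returning false.
function fetchPage(string $source): string
{
    // @ suppresses the PHP warning; we report the failure ourselves.
    $data = @file_get_contents($source);
    if ($data === false) {
        throw new RuntimeException("Could not read: $source");
    }
    return $data;
}

// Stand-in for the remote page (assumed sample content).
$tmp = tempnam(sys_get_temp_dir(), 'news');
file_put_contents($tmp, '<span style="font-size: larger;">EmbunWeb.com Are at Wordcamp Malaysia!</span>');

// On the real site this would be fetchPage('http://www.skali.net/web/guest/news').
$data = fetchPage($tmp);
echo strlen($data), " bytes read\n";
unlink($tmp);
```

With this in place, a dead link or a blocked request stops the script with a clear message instead of feeding false into the regex step.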
[Image: Skali News Html Source Code]
In the screen snapshot, I have highlighted the text pattern that marks the Skali news headlines.
For example:
<span style="font-size: larger;">EmbunWeb.com Are at Wordcamp Malaysia!</span>
My regex pattern should look like this to match all the headlines:
#<span style="font-size\: larger;">(.*?)<\/span>#is
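To see what each part of that pattern does, here is a small sketch that runs it against a hardcoded snippet instead of the live page. The sample markup is my own assumption: two headline spans plus one span with a different style that should not match. I also put a line break inside the second headline to show why the s flag matters.

```php
<?php
// Hypothetical sample markup, not the real Skali page source.
$html = '<span style="font-size: larger;">EmbunWeb.com Are at Wordcamp Malaysia!</span>'
      . '<span style="color: red;">Not a headline</span>'
      . "<span style=\"font-size: larger;\">Skali sees 20% growth\nin web hosting revenue</span>";

// #...# : pattern delimiters
// (.*?) : lazy capture group, stops at the first </span> it meets
// i     : case-insensitive matching
// s     : lets "." match newlines, so multi-line headlines still match
$regex = '#<span style="font-size\: larger;">(.*?)<\/span>#is';

$count = preg_match_all($regex, $html, $match);

// $match[0] holds the full matches, $match[1] only the captured headline text.
echo $count, " headlines found\n";
print_r($match[1]);
```

Only the two spans with the larger font style are captured. The red span is skipped because the style attribute has to match character for character, which also means the pattern breaks if the site changes its markup even slightly.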
The regex pattern needs to match only the text between <span> tags that have the larger font style. The matches then need to be dumped as an array to make sure it worked.
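<?php
// Fetch the page source as a string.
$data = file_get_contents('http://www.skali.net/web/guest/news');
// Capture the text of every <span> with the larger font style.
$regex = '#<span style="font-size\: larger;">(.*?)<\/span>#is';
preg_match_all($regex, $data, $match);
// $match[1] holds the captured headlines. print_r prints by itself;
// wrapping it in echo would print a stray "1" after the array.
echo '<pre>';
print_r($match[1]);
echo '</pre>';
?>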
Here is the output on my localhost testing system:
[Image: Weehee... we got it!!]
Array
(
    [0] => EmbunWeb.com Are at Wordcamp Malaysia!
    [1] => Kepakaran tempatan memacu MEB Oleh Aimi Aizal Nasharuddin
    [2] => Govt websites have markedly improved: MDeC
    [3] => Skali sees 20% growth in web hosting revenue
    [4] => Utusan Malaysia - Program SPIKE lahirkan Aisoft Solution
    [5] => The Edge Malaysia - Sure and Steady Skali
    [6] => Skali Looks Forward To Stage 2 Of MPS Project Feb 8 2010
)
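Once the headlines are in an array, turning them into an HTML list for your own page is a short step. Below is a rough sketch, assuming a helper I have called headlinesToList (a made-up name, not something from the script above). It strips any nested tags, tidies whitespace, and escapes the text before printing it, since scraped content should not be trusted as-is.

```php
<?php
// headlinesToList() is a hypothetical helper, not part of the original script.
function headlinesToList(array $headlines): string
{
    $items = '';
    foreach ($headlines as $headline) {
        // Drop nested tags and decode entities to get plain text.
        $text = html_entity_decode(strip_tags($headline));
        // Collapse runs of whitespace and skip empty captures.
        $text = trim(preg_replace('/\s+/', ' ', $text));
        if ($text === '') {
            continue;
        }
        // Escape before echoing scraped text into our own page.
        $items .= '  <li>' . htmlspecialchars($text) . "</li>\n";
    }
    return "<ul>\n" . $items . "</ul>\n";
}

// Two headlines from the output above; the <b> tag on the first one is an
// assumed addition to show the clean-up.
$headlines = [
    '<b>Govt websites have markedly improved: MDeC</b>',
    'Skali sees 20% growth in web hosting revenue',
];
echo headlinesToList($headlines);
```

In the full script you would pass $match[1] to the helper instead of the hardcoded array.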