Jsoup: Getting Data from an HTML Table

Do you want to get your hands dirty parsing an HTML table by hand? Of course not. Jsoup can do it for you. Assume that you have a table with two columns like this:

<html>
<head><title>First parse</title>
</head>
<body>
<p>Parsed HTML into a doc.</p>
<table>
<tr><td>satu</td><td>satu-1</td></tr><tr><td>dua</td><td>dua-1</td></tr><tr><td>tiga</td><td>tiga-1</td></tr>
</table>
</body>
</html>

With Jsoup, extracting this data is easy. Below is a Java program that uses Jsoup to read the data in the HTML table:

/*
 * Simple Jsoup Example
 */
package jsoup;

import java.util.Iterator;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

/**
 *
 * @author panji
 */
public class SimpleJsoup {

    public static void main(String[] args) {
        String html = "<html><head><title>First parse</title></head>"
                + "<body><p>Parsed HTML into a doc.</p>"
                + " <table><tr><td>satu</td><td>satu-1</td></tr><tr><td>dua</td><td>dua-1</td></tr><tr><td>tiga</td><td>tiga-1</td></tr></table> "
                + "</body></html>";
        Document doc = Jsoup.parse(html);
        Element table = doc.select("table").first();
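        // Iterate over every <td> cell of the table in document order.
        Iterator<Element> iterator = table.select("td").iterator();
        while (iterator.hasNext()) {
            // Each row has exactly two cells, so two next() calls consume one row.
            System.out.println("text : " + iterator.next().text()); // column 1
            System.out.println("text : " + iterator.next().text()); // column 2
        }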
        String title = doc.title();
        System.out.println("Document title : "+title);
    }
}

In the code above, each pass through the loop reads both columns of one row. This works because every row in the table has exactly two cells. So, it’s simple, right?
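If the number of columns is not known in advance, pairing cells with two next() calls would fail on a row with an odd number of cells. One possible alternative is to walk the table row by row and collect whatever cells each row contains. The sketch below assumes the same table as above; the class name TableRows and the helper readRows are made up for illustration and are not part of Jsoup.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class TableRows {

    // Hypothetical helper (not a Jsoup API): returns one list of cell texts per <tr>.
    static List<List<String>> readRows(String html) {
        Document doc = Jsoup.parse(html);
        List<List<String>> rows = new ArrayList<>();
        // "table tr" matches every row, even though the parser inserts a <tbody>.
        for (Element row : doc.select("table tr")) {
            List<String> cells = new ArrayList<>();
            // Collect the text of each <td> in this row, however many there are.
            for (Element cell : row.select("td")) {
                cells.add(cell.text());
            }
            rows.add(cells);
        }
        return rows;
    }

    public static void main(String[] args) {
        String html = "<table><tr><td>satu</td><td>satu-1</td></tr>"
                + "<tr><td>dua</td><td>dua-1</td></tr>"
                + "<tr><td>tiga</td><td>tiga-1</td></tr></table>";
        // Print each row with its cells separated by " | ".
        for (List<String> cells : readRows(html)) {
            System.out.println(String.join(" | ", cells));
        }
    }
}
```

Because row.select("td") also matches cells of any table nested inside a row, this sketch is only meant for simple, flat tables like the one in this post.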

