com.q2learning.webtier.webhuddle
Class HtmlUtils

java.lang.Object
  extended by com.q2learning.webtier.webhuddle.HtmlUtils

public class HtmlUtils
extends java.lang.Object

HtmlUtils: General purpose utilities for parsing HTML text.

11/20/2007 Charles Roth. First version.

Copyright (C) 2007 Q2learning LLC, www.q2learning.com. All rights reserved.


Method Summary
static java.lang.String getFieldInTagInHtml(java.lang.String fieldName, java.lang.String tagName, java.lang.String htmlText, int startPos)
          Get the value of a particular field in the first tag of a given name from HTML text.
static java.lang.String getFieldNamed(java.lang.String fieldName, java.lang.String tagText)
          Pluck out the value of a given field from inside an HTML tag, which might have been returned from getTagAt().
static int getNextTagPosition(java.lang.String tagName, java.lang.String htmlText, int startPos)
          Get the starting position of the next tag with a particular name.
static java.lang.String getTagAt(java.lang.String htmlText, int openTagPos)
          Pluck an entire tag from a given starting position (typically found from getNextTagPosition()).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

getNextTagPosition

public static int getNextTagPosition(java.lang.String tagName,
                                     java.lang.String htmlText,
                                     int startPos)
Get the starting position of the next tag with a particular name. Typically used in a loop to keep finding the "next" tag (with that name).

Parameters:
tagName - Example: "a" to look for an "<a>" tag.
htmlText - HTML text to search.
startPos - Start looking at this position.
Returns:
position of start of desired tag.
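
The loop usage described above can be sketched as follows. The source of HtmlUtils is not shown here, so the method body below is a hypothetical stand-in: the class name TagPositionSketch, the helper allTagPositions, the case-insensitive matching, and the -1 return value for "no more tags" are all assumptions, not documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in; NOT the real com.q2learning HtmlUtils source.
class TagPositionSketch {

    // Assumptions: matching ignores case, and -1 means "no further tag".
    // The documentation above states neither.
    static int getNextTagPosition(String tagName, String htmlText, int startPos) {
        String open = "<" + tagName;
        for (int pos = Math.max(startPos, 0); pos + open.length() <= htmlText.length(); pos++) {
            if (!htmlText.regionMatches(true, pos, open, 0, open.length())) {
                continue;
            }
            int end = pos + open.length();
            // Reject longer tag names such as "<abbr" when looking for "a".
            if (end == htmlText.length() || !Character.isLetterOrDigit(htmlText.charAt(end))) {
                return pos;
            }
        }
        return -1;
    }

    // Keep asking for the "next" tag, restarting just past the previous hit.
    static List<Integer> allTagPositions(String tagName, String htmlText) {
        List<Integer> positions = new ArrayList<>();
        int pos = getNextTagPosition(tagName, htmlText, 0);
        while (pos >= 0) {
            positions.add(pos);
            pos = getNextTagPosition(tagName, htmlText, pos + 1);
        }
        return positions;
    }

    public static void main(String[] args) {
        String html = "<p><a href=\"one.html\">1</a> <abbr>x</abbr> <A HREF='two.html'>2</A></p>";
        System.out.println(allTagPositions("a", html)); // prints [3, 43]
    }
}
```

Restarting at pos + 1 rather than pos is what keeps the loop from finding the same tag forever.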

getTagAt

public static java.lang.String getTagAt(java.lang.String htmlText,
                                        int openTagPos)
Pluck an entire tag from a given starting position (typically found from getNextTagPosition()).

Parameters:
htmlText - HTML text containing the tag.
openTagPos - Tag starts at this position.
Returns:
entire text of tag, without the leading "<" and trailing ">".
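
One way the documented return value (tag text without the leading "<" and trailing ">") could be produced is sketched below. This is an illustrative stand-in, not the library's code: the class name TagTextSketch and the choice to return an empty string for a bad position or an unterminated tag are assumptions.

```java
// Hypothetical stand-in; NOT the real com.q2learning HtmlUtils source.
class TagTextSketch {

    // Assumption: an empty string is returned when openTagPos is out of range,
    // does not point at "<", or the tag is never closed.
    static String getTagAt(String htmlText, int openTagPos) {
        if (openTagPos < 0 || openTagPos >= htmlText.length()
                || htmlText.charAt(openTagPos) != '<') {
            return "";
        }
        int close = htmlText.indexOf('>', openTagPos + 1);
        // Everything strictly between "<" and the next ">".
        return close < 0 ? "" : htmlText.substring(openTagPos + 1, close);
    }

    public static void main(String[] args) {
        String html = "<p><a href=\"one.html\">1</a></p>";
        System.out.println(getTagAt(html, 3)); // prints a href="one.html"
    }
}
```

This naive scan stops at the first ">", so a ">" inside a quoted field value would cut the tag short.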

getFieldNamed

public static java.lang.String getFieldNamed(java.lang.String fieldName,
                                             java.lang.String tagText)
Pluck out the value of a given field from inside an HTML tag, which might have been returned from getTagAt().

Parameters:
fieldName - Name of field, e.g. "href".
tagText - Text of entire tag.
Returns:
value of field 'fieldName'.
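
A minimal sketch of extracting a field value from tag text, assuming the input has the shape returned by getTagAt() (no surrounding "<" and ">"). The class name FieldValueSketch, the case-insensitive field match, the support for double-quoted, single-quoted, and unquoted values, and the empty-string result for a missing field are all assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in; NOT the real com.q2learning HtmlUtils source.
class FieldValueSketch {

    // Assumption: returns "" when the field is absent.
    static String getFieldNamed(String fieldName, String tagText) {
        // The field name must follow whitespace (or start the text), so "ref"
        // does not match inside "href". Groups 1-3 hold the double-quoted,
        // single-quoted, or unquoted value respectively.
        Pattern p = Pattern.compile("(?i)(?:^|\\s)" + Pattern.quote(fieldName)
                + "\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))");
        Matcher m = p.matcher(tagText);
        if (!m.find()) {
            return "";
        }
        for (int g = 1; g <= 3; g++) {
            if (m.group(g) != null) {
                return m.group(g);
            }
        }
        return "";
    }

    public static void main(String[] args) {
        String tagText = "a href=\"one.html\" title='T'";
        System.out.println(getFieldNamed("href", tagText));  // prints one.html
        System.out.println(getFieldNamed("title", tagText)); // prints T
    }
}
```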

getFieldInTagInHtml

public static java.lang.String getFieldInTagInHtml(java.lang.String fieldName,
                                                   java.lang.String tagName,
                                                   java.lang.String htmlText,
                                                   int startPos)
Get the value of a particular field in the first tag of a given name from HTML text.

Parameters:
fieldName - Name of field to get (e.g. "href")
tagName - Name of tag to get (e.g. "a" looks for an "<a>" tag)
htmlText - HTML text to search.
startPos - Look for tag and field, starting at this position.
Returns:
value of desired field.
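
Judging from the method descriptions, this method reads like a convenience wrapper over the other three: locate the tag, pluck its text, then pluck the field. Whether the real implementation is built that way is not stated. The sketch below shows that composition under stated assumptions: the class name FieldInTagSketch is hypothetical, matching ignores case, and an empty string is returned when the tag or field is missing.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in; NOT the real com.q2learning HtmlUtils source.
class FieldInTagSketch {

    static String getFieldInTagInHtml(String fieldName, String tagName,
                                      String htmlText, int startPos) {
        // Step 1 (cf. getNextTagPosition): find "<tagName" at or after startPos,
        // refusing longer names such as "<abbr" for "a".
        Matcher tag = Pattern.compile("(?i)<" + Pattern.quote(tagName) + "(?![A-Za-z0-9])")
                .matcher(htmlText);
        int from = Math.min(Math.max(startPos, 0), htmlText.length());
        if (!tag.find(from)) {
            return ""; // assumption: no such tag yields ""
        }
        // Step 2 (cf. getTagAt): text between "<" and the next ">".
        int close = htmlText.indexOf('>', tag.start());
        if (close < 0) {
            return "";
        }
        String tagText = htmlText.substring(tag.start() + 1, close);
        // Step 3 (cf. getFieldNamed): quoted or unquoted value of fieldName.
        Matcher field = Pattern.compile("(?i)\\s" + Pattern.quote(fieldName)
                + "\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))").matcher(tagText);
        if (!field.find()) {
            return ""; // assumption: no such field yields ""
        }
        for (int g = 1; g <= 3; g++) {
            if (field.group(g) != null) {
                return field.group(g);
            }
        }
        return "";
    }

    public static void main(String[] args) {
        String html = "<p><a href=\"one.html\">1</a> <a href='two.html'>2</a></p>";
        System.out.println(getFieldInTagInHtml("href", "a", html, 0)); // prints one.html
        System.out.println(getFieldInTagInHtml("href", "a", html, 4)); // prints two.html
    }
}
```

Passing a startPos just past the previous tag, as in the second call, is how a caller would step through successive tags.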