Pretty much about everything: 08/2011

8/29/2011

Showing text to the right of the input box in Key Survey

Today I saw a really neat workaround in one of the Key Survey surveys, where the author had placed text to the right of the input box of a Single Line question. Kind of like this:

Note that the word "years" appears to the right of the text input box. And here's how it's done:

Note that the label for the first answer option is empty and that there is a second answer option with two special tags next to it. The <newcolumn/> tag tells the application that the answer option should appear in a new column (typically this tag is used to split the answer options of one question into multiple columns, so that long lists of answer options do not require scrolling). The <subheader/> tag tells the application that this is not a real answer option and that no input box should be shown next to it. Used together, the two tags give you a quick way of showing text to the right of the input box. (P.S. Some CSS tweaking may be needed to make the items appear next to each other, but not always.)

8/28/2011

Converting HTML pages to PDF in pure Java

There are plenty of commercial HTML to PDF converters for the .NET platform (most of which are based on the Internet Explorer libraries that are available in Windows), but HTML to PDF conversion in Java is not that easy. It is not impossible, though. FlyingSaucer can convert properly formatted XHTML and CSS2 to PDF (this is described in detail here: http://today.java.net/pub/a/today/2007/06/26/generating-pdfs-with-flying-saucer-and-itext.html). Unfortunately, most of the web pages out there are not written in XHTML, and when FlyingSaucer has to deal with such a page it simply throws an exception and quits. There is a way around this limitation: as the developers of FlyingSaucer suggest, we can first run the page through one of the existing HTML code cleaners (for example TagSoup, JTidy or HTMLCleaner). Below I would like to show you an example of using HTMLCleaner, FlyingSaucer and iText to convert HTML to PDF.

package htmltopdf;

import com.lowagie.text.DocumentException;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.apache.commons.io.IOUtils;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.htmlcleaner.CleanerProperties;
import org.htmlcleaner.CommentNode;
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.HtmlNode;
import org.htmlcleaner.PrettyXmlSerializer;
import org.htmlcleaner.TagNode;
import org.htmlcleaner.TagNodeVisitor;
import org.htmlcleaner.Utils;
import org.xhtmlrenderer.pdf.ITextRenderer;


public class HTMLtoPDF {

    static int cssCounter = 0;
    
    public static void main(String[] args) {
        try {
            final String site = "http://www.keysurvey.co.za";
            final String page = "/company/";
            final String cssUrl = "http://www.keysurvey.co.za";
            
            URL url = new URL(site+page);

            CleanerProperties props = new CleanerProperties();

// HTMLCleaner part
// set some properties to non-default values
            props.setTranslateSpecialEntities(true);
            props.setTransResCharsToNCR(true);
            props.setOmitComments(true);

// do parsing
            TagNode tagNode = new HtmlCleaner(props).clean(url);
            tagNode.traverse(new TagNodeVisitor() {

                public boolean visit(TagNode tagNode, HtmlNode htmlNode) {
                    if (htmlNode instanceof TagNode) {
                        TagNode tag = (TagNode) htmlNode;
                        String tagName = tag.getName();
                        if ("img".equals(tagName)) {
                            String src = tag.getAttributeByName("src");
                            if (src != null && ! src.startsWith("http")) {
                                tag.setAttribute("src", Utils.fullUrl(site, src));
                            }
                        }
                        if ("link".equals(tagName)) {
                            String rel = tag.getAttributeByName("rel");
                            String href = tag.getAttributeByName("href");
                            if (href != null && "stylesheet".equals(rel)) {
                                try {
                                    HttpClient client = new DefaultHttpClient();
                                    String fullUrl = "";
                                    if (href.startsWith("http")) fullUrl = href;
                                    else fullUrl = Utils.fullUrl(cssUrl, href);
                                    HttpGet get = new HttpGet(fullUrl);
                                    HttpResponse response = client.execute(get);
                                    HttpEntity entity = response.getEntity();
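                                    if (entity != null) {
                                        InputStream is = entity.getContent();
                                        href = "css" + cssCounter + ".css";
                                        cssCounter++;
                                        OutputStream os = new FileOutputStream(href);
                                        try {
                                            IOUtils.copy(is, os);
                                        } finally {
                                            // close both streams so the local CSS copy is flushed to disk
                                            IOUtils.closeQuietly(os);
                                            IOUtils.closeQuietly(is);
                                        }
                                    }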
                                    tag.setAttribute("href", href);
                                } catch (IOException ex) {
                                    Logger.getLogger(HTMLtoPDF.class.getName()).log(Level.SEVERE, null, ex);
                                }
                            }
                        }
                    } else if (htmlNode instanceof CommentNode) {
                        CommentNode comment = ((CommentNode) htmlNode);
                        comment.getContent().append(" -- By HtmlCleaner");
                    }
                    // tells visitor to continue traversing the DOM tree
                    return true;
                }
            });



// serialize to xml file
            new PrettyXmlSerializer(props).writeToFile(
                    tagNode, "page.xhtml", "utf-8");

// FlyingSaucer and iText part
            String inputFile = "page.xhtml";
            String url2 = new File(inputFile).toURI().toURL().toString();
            String outputFile = "firstdoc.pdf";
            OutputStream os = new FileOutputStream(outputFile);
            try {
                ITextRenderer renderer = new ITextRenderer();
                renderer.setDocument(url2);
                renderer.layout();
                renderer.createPDF(os);
            } finally {
                // release the output file even if rendering fails
                os.close();
            }


        } catch (DocumentException ex) {
            Logger.getLogger(HTMLtoPDF.class.getName()).log(Level.SEVERE, null, ex);
        } catch (IOException ex) {
            Logger.getLogger(HTMLtoPDF.class.getName()).log(Level.SEVERE, null, ex);

        }
    }
}

FlyingSaucer cannot read CSS files over the web, so we have to save them locally. While traversing the parsed HTML of the page, we save a local copy of the CSS file for every link tag with rel='stylesheet' and point the tag's href at that copy. Relative links to image files also need to be replaced with absolute URLs.
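The URL rewriting described above can also be sketched with nothing but the JDK. This is a minimal illustration and not the code used in the listing, which relies on HTMLCleaner's Utils.fullUrl: the class name UrlResolver, the method toAbsolute and the image and stylesheet paths are made up for the example. It assumes the references are valid URIs (java.net.URI rejects unescaped spaces, for instance).

```java
import java.net.URI;

class UrlResolver {

    // Resolves a possibly relative reference (an img src or a stylesheet href)
    // against the URL of the page it was found on. Absolute references are
    // returned unchanged.
    static String toAbsolute(String pageUrl, String reference) {
        return URI.create(pageUrl).resolve(reference).toString();
    }

    public static void main(String[] args) {
        // The page URL is taken from the listing; the paths are hypothetical.
        String page = "http://www.keysurvey.co.za/company/";
        System.out.println(toAbsolute(page, "/images/logo.png"));
        System.out.println(toAbsolute(page, "css/main.css"));
        System.out.println(toAbsolute(page, "http://example.com/a.png"));
    }
}
```

Resolving against the full page URL rather than the site root means that references relative to the current directory, such as css/main.css, end up under /company/ as a browser would load them.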

To compile and run the above example you will need the JARs supplied with the FlyingSaucer distribution (it includes XHTMLRenderer and iText), Apache HttpClient, Apache Commons IO (for IOUtils) and HTMLCleaner.

Overall the code works fairly well on a variety of pages, even poorly formatted ones. I have noticed a few issues that may prevent some pages from being rendered to PDF correctly. Here are some of them:
1. Some types of comments (e.g. .ClassName /* some comment */ { ... } ) are not supported by the CSS parser of FlyingSaucer. It simply stops parsing the CSS without throwing any exceptions.
2. @import rules and url() values in the CSS are not supported.
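One possible way around the first issue is to strip the comments from each stylesheet before saving the local copy. The sketch below is an assumption on my part and is not part of the listing above; the class name CssCommentStripper and the method stripComments are hypothetical. It uses a plain regular expression, so a /* sequence inside a quoted CSS string would be removed as well.

```java
import java.util.regex.Pattern;

class CssCommentStripper {

    // Matches /* ... */ blocks, including ones that span several lines.
    // The reluctant quantifier ends each match at the first closing */.
    private static final Pattern COMMENT =
            Pattern.compile("/\\*.*?\\*/", Pattern.DOTALL);

    // Returns the stylesheet text with all comment blocks removed.
    static String stripComments(String css) {
        return COMMENT.matcher(css).replaceAll("");
    }

    public static void main(String[] args) {
        String css = ".ClassName /* some comment */ { color: red; }";
        System.out.println(stripComments(css));
    }
}
```

In the listing this would mean reading the downloaded stylesheet into a String (for example with IOUtils.toString), cleaning it and writing the result to the local file instead of copying the stream directly.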

Still, for many pages the described solution should be good enough, especially when you need to convert internal pages that you can properly prepare for the PDF conversion.

Adding the VKontakte "Like" button to Blogger

Today I looked through a number of posts where people described their ways of adding the VKontakte "Like" ("Мне нравится") button to Blogger, but I did not find a single one that works 100%. More precisely, the working variants came down to this: the button works when only one post is displayed on the page, and otherwise it is simply hidden. The problem is that the script VKontakte offers by default assumes that there can be only one "Like" button on a page.

So, here are step-by-step instructions for showing the "Like" button for every post when several posts are displayed on one page.

1. Go to the VKontakte "Like" widget page: http://vkontakte.ru/developers.php?o=-1&p=Like and generate the code with the settings you need.
2. Open the template editing page in Blogger: Design > Edit HTML. Tick the Expand widget templates checkbox:

3. Right after the tag, insert the first part of the code generated by VKontakte. You should end up with something like:

4. Find the following line in the template code:
<div class="post-footer-line post-footer-line-3">. Below this line, insert the following code:

That's it: a working "Like" button will now be shown at the bottom of each of your posts.
