Code:
fs = new POIFSFileSystem(new FileInputStream("c:\\Data.docx"));
HWPFDocument doc = new HWPFDocument(fs);
readParagraphs(doc);
Where the readParagraphs method is:
Code:
public static void readParagraphs(HWPFDocument doc) throws Exception {
    WordExtractor we = new WordExtractor(doc);
    // Get the text of every paragraph in the document
    String[] paragraphs = we.getParagraphText();
    System.out.println("Total Paragraphs: " + paragraphs.length);
    for (int i = 0; i < paragraphs.length; i++) {
        System.out.println("Length of paragraph " + (i + 1) + ": " + paragraphs[i].length());
        System.out.println(paragraphs[i]);
    }
}
When I run it, I get this exception:
Code:
org.apache.poi.poifs.filesystem.OfficeXmlFileException: The supplied data appears to be in the Office 2007+ XML. You are calling the part of POI that deals with OLE2 Office Documents. You need to call a different part of POI to process this data (eg XSSF instead of HSSF)
    at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:131)
    at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:104)
    at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:138)
    at anagram.Anagram.main(Anagram.java:55)
Does anyone know how to fix it?
I managed to read the document using the getText method of XWPFWordExtractor, but I need to extract the data line by line in order to process it. Is there another method for that?
Regards and thanks!
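One possible approach, sketched here under assumptions: since getText returns the whole document as a single String, that String can be split on line breaks and processed line by line. The class name LineSplitter, the helper splitLines, and the sample text are hypothetical and only illustrate the idea; the sample string stands in for whatever XWPFWordExtractor.getText() returns. Skipping blank lines is also an assumption about what "processing" needs.

```java
import java.util.ArrayList;
import java.util.List;

class LineSplitter {

    /** Splits extracted text into its non-blank lines (hypothetical helper). */
    static List<String> splitLines(String text) {
        List<String> lines = new ArrayList<>();
        // \R matches any line break sequence (\n, \r\n, \r, ...), Java 8+
        for (String line : text.split("\\R")) {
            // Assumption: blank lines carry no data, so they are skipped
            if (!line.trim().isEmpty()) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in for the String returned by XWPFWordExtractor.getText()
        String text = "first line\nsecond line\r\n\nthird line\n";
        List<String> lines = splitLines(text);
        for (int i = 0; i < lines.size(); i++) {
            System.out.println("Line " + (i + 1) + ": " + lines.get(i));
        }
    }
}
```

POI's XWPFDocument also exposes getParagraphs(), which returns the paragraphs as a list and may be closer to what the HWPF code above was trying to do; splitting the extracted text is simply the smallest change from the getText call that already works.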


