PHP Doc or Docx Word File to TXT extract Content Example
Published at: 2/17/2020
Categories: php, doc, docx
Author: kevinmel2000
// Converts a .doc or .docx Word file to plain text.
class Doc2Txt {
    private $filename;

    public function __construct($filePath) {
        $this->filename = $filePath;
    }

    // Rough extraction for legacy binary .doc files: keeps only the
    // CR-delimited chunks that contain no NUL bytes, then strips every
    // character outside a small ASCII whitelist.
    private function read_doc() {
        $data = @file_get_contents($this->filename);
        if ($data === false) {
            return false; // file could not be read
        }
        $outtext = "";
        foreach (explode(chr(0x0D), $data) as $thisline) {
            // Chunks containing NUL bytes are binary structures, not text.
            if (strlen($thisline) == 0 || strpos($thisline, chr(0x00)) !== false) {
                continue;
            }
            $outtext .= $thisline . " ";
        }
        return preg_replace("/[^a-zA-Z0-9\s\,\.\-\n\r\t@\/\_\(\)]/", "", $outtext);
    }

    // Extracts text from a .docx file, which is a ZIP archive whose main
    // body is stored in word/document.xml. Requires the PHP zip extension.
    private function read_docx() {
        // ZipArchive replaces the zip_open() family, deprecated since PHP 8.0.
        $zip = new ZipArchive();
        if ($zip->open($this->filename) !== true) {
            return false; // not a readable ZIP archive
        }
        $content = $zip->getFromName("word/document.xml");
        $zip->close();
        if ($content === false) {
            return false; // archive has no Word document part
        }
        // Separate table cells with a space and paragraphs with a line break.
        $content = str_replace('</w:r></w:p></w:tc><w:tc>', " ", $content);
        $content = str_replace('</w:r></w:p>', "\r\n", $content);
        return strip_tags($content);
    }

    // Returns the extracted text, false if the file could not be parsed,
    // or an error message string for a missing or unsupported file.
    public function convertToText() {
        if (!isset($this->filename) || !file_exists($this->filename)) {
            return "File does not exist";
        }
        // Compare the extension case-insensitively; it may be absent entirely.
        $file_ext = strtolower(pathinfo($this->filename, PATHINFO_EXTENSION));
        if ($file_ext == "doc") {
            return $this->read_doc();
        }
        if ($file_ext == "docx") {
            return $this->read_docx();
        }
        return "Invalid File Type";
    }
}
Example usage:
$docObj = new Doc2Txt($inputfile);
$txt = $docObj->convertToText();
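To try the .docx path in isolation, here is a minimal sketch, assuming the PHP zip extension is installed. The function name `docx_to_text` and the tiny fixture archive it is run against are hypothetical, made up for illustration, and are not part of the class above. The sketch reads `word/document.xml` through `ZipArchive` and additionally decodes XML entities such as `&amp;`, a step the class does not perform.

```php
<?php
// Hypothetical helper (not part of Doc2Txt): returns the visible text of a
// .docx file, or null if the archive or its document part cannot be read.
function docx_to_text(string $path): ?string
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        return null; // not a readable ZIP archive
    }
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        return null; // ZIP archive without a Word document part
    }
    // Turn paragraph ends into newlines so words do not run together.
    $xml = str_replace('</w:p>', "\n", $xml);
    // Drop the remaining markup, then decode XML entities in the text.
    $text = html_entity_decode(strip_tags($xml), ENT_QUOTES | ENT_XML1, 'UTF-8');
    return trim($text);
}

// Build a tiny stand-in archive so the sketch runs without a real Word file.
// It contains only the one part the helper reads, so Word itself would not
// accept it as a valid document.
$path = sys_get_temp_dir() . '/doc2txt_' . uniqid() . '.docx';
$zip = new ZipArchive();
$zip->open($path, ZipArchive::CREATE);
$zip->addFromString(
    'word/document.xml',
    '<w:document><w:body>'
    . '<w:p><w:r><w:t>Hello</w:t></w:r></w:p>'
    . '<w:p><w:r><w:t>World &amp; more</w:t></w:r></w:p>'
    . '</w:body></w:document>'
);
$zip->close();

$text = docx_to_text($path);
unlink($path);
echo $text, PHP_EOL;
```

Returning `null` on failure keeps error cases distinguishable from extracted text, which is harder when errors come back as ordinary strings such as "Invalid File Type".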