Generating a PDF file from HTML content with Select.HtmlToPdf

1. NuGet reference

Select.HtmlToPdf

2. Methods

using SelectPdf;
using System.Collections.Specialized;
using System.IO;
using System.Web;

namespace BQoolCommon.Helpers.File
{
    public class WebToPdf
    {
        public WebToPdf()
        {
            //SelectPdf.GlobalProperties.LicenseKey = "your-license-key";
        }

        /// <summary>
        /// Converts HTML to PDF and saves it to a file.
        /// </summary>
        /// <param name="html">html</param>
        /// <param name="fileName">Absolute path</param>
        public void SaveToFileByHtml(string html, string fileName)
        {
            var doc = SetPdfDocument(html);
            doc.Save(fileName);
        }

        /// <summary>
        /// Converts the page at a URL to PDF and saves it to a file.
        /// </summary>
        /// <param name="url">url</param>
        /// <param name="fileName">Absolute path</param>
        /// <param name="httpCookies">Cookies</param>
        public void SaveToFileByUrl(string url, string fileName, NameValueCollection httpCookies)
        {
            var doc = SetPdfDocument(url, httpCookies);
            doc.Save(fileName);
        }

        /// <summary>
        /// Converts HTML to PDF and returns it as a byte[].
        /// </summary>
        /// <param name="html">html</param>
        /// <returns></returns>
        public byte[] GetFileByteByHtml(string html)
        {
            var doc = SetPdfDocument(html);
            return doc.Save();
        }

        /// <summary>
        /// Converts the page at a URL to PDF and returns it as a byte[].
        /// </summary>
        /// <param name="url">url</param>
        /// <param name="httpCookies">Cookies</param>
        /// <returns></returns>
        public byte[] GetFileByteByUrl(string url, NameValueCollection httpCookies)
        {
            var doc = SetPdfDocument(url, httpCookies);
            return doc.Save();
        }

        /// <summary>
        /// Converts HTML to PDF and returns it as a Stream.
        /// </summary>
        /// <param name="html">html</param>
        /// <returns></returns>
        public Stream GetFileStreamByHtml(string html)
        {
            var doc = SetPdfDocument(html);
            var pdfStream = new MemoryStream();
            doc.Save(pdfStream);
            // Rewind so the caller reads from the start of the PDF.
            pdfStream.Position = 0;
            return pdfStream;
        }

        /// <summary>
        /// Converts the page at a URL to PDF and returns it as a Stream.
        /// </summary>
        /// <param name="url">url</param>
        /// <param name="httpCookies">Cookies</param>
        /// <returns></returns>
        public Stream GetFileStreamByUrl(string url, NameValueCollection httpCookies)
        {
            var doc = SetPdfDocument(url, httpCookies);
            var pdfStream = new MemoryStream();
            doc.Save(pdfStream);
            // Rewind so the caller reads from the start of the PDF.
            pdfStream.Position = 0;
            return pdfStream;
        }

        private PdfDocument SetPdfDocument(string html)
        {
            var converter = new HtmlToPdf();
            converter.Options.WebPageWidth = 1200;
            // Decode HTML entities before handing the markup to the converter.
            html = HttpUtility.HtmlDecode(html);
            return converter.ConvertHtmlString(html);
        }

        private PdfDocument SetPdfDocument(string url, NameValueCollection httpCookies)
        {
            var converter = new HtmlToPdf();
            converter.Options.WebPageWidth = 1200;
            // Forward cookies only when the caller supplied some.
            if (httpCookies != null && httpCookies.Count != 0)
            {
                converter.Options.HttpCookies.Add(httpCookies);
            }
            return converter.ConvertUrl(url);
        }
    }
}

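String-formatting functions related to HTML tags

string nl2br ( string $string )

nl2br() inserts an HTML line break (<br />) before each newline (\n) in the string; the newline itself is kept. Note that in a JavaScript string (for example in alert()) only \n produces a line break; <br> does not.

htmlspecialchars() converts a few predefined characters to HTML entities.

string htmlspecialchars(string,quotestyle,[character-set])

The following characters are converted to the corresponding entities:

& (ampersand) becomes &amp;
" (double quote) becomes &quot;
' (single quote) becomes &#039; (only when ENT_QUOTES is set)
< (less than) becomes &lt;
> (greater than) becomes &gt;

Second parameter: ENT_COMPAT converts double quotes only and leaves single quotes alone. It was the default before PHP 8.1; since PHP 8.1 the default is ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401.

ENT_QUOTES converts both kinds of quotes.

ENT_NOQUOTES converts neither kind of quote.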

<html>
<body>
<?php
$str = "John & \" 'Adams'";
echo htmlspecialchars($str, ENT_COMPAT);
echo "<br />";
echo htmlspecialchars($str, ENT_QUOTES);
echo "<br />";
echo htmlspecialchars($str, ENT_NOQUOTES);
?>
</body>
</html>

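Output in the browser (all three lines look the same):

John & " 'Adams'

The page source shows what each call actually produced:

John &amp; &quot; 'Adams'<br />John &amp; &quot; &#039;Adams&#039;<br />John &amp; " 'Adams'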
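The effect of the three quote flags can be mimicked in a few lines of Python. This is only an illustrative emulation, not PHP's implementation: the function name escape_like_php and the style strings "compat", "quotes" and "noquotes" are made up for this sketch.

```python
def escape_like_php(text: str, quote_style: str = "compat") -> str:
    """Rough emulation of htmlspecialchars() for the three quote flags."""
    # Replace the ampersand first so the entities added below are not re-escaped.
    out = text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    if quote_style in ("compat", "quotes"):
        out = out.replace('"', "&quot;")   # ENT_COMPAT and ENT_QUOTES
    if quote_style == "quotes":
        out = out.replace("'", "&#039;")   # ENT_QUOTES only
    return out


sample = "John & \" 'Adams'"
for style in ("compat", "quotes", "noquotes"):
    print(style, "->", escape_like_php(sample, style))
```

The ampersand is handled before the other characters on purpose: doing it last would turn an already produced &quot; into &amp;quot;.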

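htmlentities() converts every character that has a named HTML entity into that entity, not only the five characters handled by htmlspecialchars(). ASCII letters, digits and the backslash are left unchanged.

<?php
$str = "A 'quote' \" is <b>bold</b>" ;
echo htmlentities ( $str , ENT_COMPAT ); // page source: A 'quote' &quot; is &lt;b&gt;bold&lt;/b&gt;
echo htmlentities ( $str , ENT_QUOTES ); // page source: A &#039;quote&#039; &quot; is &lt;b&gt;bold&lt;/b&gt;
?>

Displayed in the browser (both calls look the same): A 'quote' " is <b>bold</b>

Note: htmlspecialchars() and htmlentities() are used to display HTML markup literally, as text, instead of letting the browser interpret it.

Neither function converts the backslash ("\") to an entity: it is either consumed as an escape character by the PHP string literal or output unchanged.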

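The difference between htmlentities and htmlspecialchars in PHP

Both functions convert characters to HTML entities, which matters especially for URLs and code strings, so that the browser does not interpret them as markup.

For plain ASCII text the two behave the same, but htmlentities() can turn Chinese text into garbage when the charset argument does not match the string's encoding (before PHP 5.4 the default charset was ISO-8859-1).

htmlentities() converts every character that has an entity; htmlspecialchars() converts only the special characters & ' " < and >.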

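addslashes() adds a backslash before the predefined characters.

The predefined characters are: single quote ('), double quote ("), backslash (\) and the NUL character (\x00).

Tip: this function has been used to prepare strings for storage in a database and for database queries, although prepared statements or the database's own escaping function are the safer choice.

Note: in old PHP versions the magic_quotes_gpc directive was on by default and ran addslashes() automatically on all GET, POST and COOKIE data. The directive was removed in PHP 5.4.

Do not call addslashes() on a string that magic_quotes_gpc has already escaped, because that escapes it twice.

On those old versions you can check with get_magic_quotes_gpc() (for example: $c=(!get_magic_quotes_gpc())?addslashes($c):$c;). The function itself was removed in PHP 8.0.

In this example we add backslashes to the predefined characters in a string: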

<?php
$str = "Who's John Adams?";
echo $str . " This is not safe in a database query.<br />";
echo addslashes($str) . " This is safe in a database query.";
?>

Output:

Who's John Adams? This is not safe in a database query.

Who\'s John Adams? This is safe in a database query.

<?php
header("Content-type:text/html; charset=utf-8");
$str = "wo are \x0a studying \x00 php";
echo $str;
echo "<br>";
echo addslashes($str);
?>

Output:

wo are studying php

wo are studying \0 php
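The escaping rule itself is small enough to spell out. The following Python sketch imitates what addslashes() does to the four predefined characters; the function name addslashes_like_php is invented for illustration and this is not how PHP implements it.

```python
def addslashes_like_php(text: str) -> str:
    """Imitate addslashes(): escape ' " \\ and turn NUL into backslash-zero."""
    out = []
    for ch in text:
        if ch in ("'", '"', "\\"):
            out.append("\\" + ch)   # prefix the character with a backslash
        elif ch == "\x00":
            out.append("\\0")       # NUL becomes the two characters \ and 0
        else:
            out.append(ch)
    return "".join(out)


print(addslashes_like_php("Who's John Adams?"))
print(addslashes_like_php("wo are \x0a studying \x00 php"))
```

Note that the newline (\x0a) in the second string is not one of the predefined characters, so it passes through untouched.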

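stripslashes() removes backslashes ("\")

On old PHP versions, characters such as ' " \ in submitted form data automatically get a \ put in front of them. This is the magic_quotes_gpc option in php.ini at work.

It was on by default. If such characters are left unescaped when the data is saved to a database, the database may mistake them for control characters and raise an error.

htmlspecialchars() and stripslashes() are commonly combined to process submitted form data: htmlspecialchars(stripslashes())

strip_tags()

string strip_tags ( string $str [, string $allowable_tags ] )

Strips HTML, XML and PHP tags.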

<?php
echo strip_tags("Hello <b><i>world!</i></b>","<b>");
?>

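Output: Hello world! (the <i> tags are stripped; <b> is in the allowed list and is kept, so "world!" is shown in bold)

Example: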

<?php
$str = "<b>webserver;</b> & \ 'Linux' & Apache";
echo "$str"; // output as is
echo "<br/>";
echo htmlspecialchars($str,ENT_COMPAT); // convert double quotes only (the default before PHP 8.1)
echo "<br />";
echo htmlspecialchars($str,ENT_NOQUOTES); // do not convert quotes
echo "<br />";
echo htmlspecialchars($str,ENT_QUOTES); // convert both single and double quotes
echo "<br />";
echo htmlentities($str); // convert every character that has an entity
echo "<br />";
echo addslashes($str); // add a backslash before " ' and \
echo "<br />";
echo stripslashes($str); // remove backslashes
echo "<br />";
echo strip_tags($str); // remove HTML tags
?>

Output:

webserver; & \ 'Linux' & Apache

The tool used here is pdfkit, a Python wrapper for wkhtmltopdf.

Installation

1. Install python-pdfkit:

$ pip install pdfkit

2. Install wkhtmltopdf:

Debian/Ubuntu:

$ sudo apt-get install wkhtmltopdf

Redhat/CentOS:

$ sudo yum install wkhtmltopdf

MacOS:

$ brew install Caskroom/cask/wkhtmltopdf

Usage

A simple example:

import pdfkit

pdfkit.from_url('http://google.com', 'out.pdf')

pdfkit.from_file('test.html', 'out.pdf')

pdfkit.from_string('Hello!', 'out.pdf')

You can also pass a list of URLs or file names:

pdfkit.from_url(['google.com', 'yandex.ru', 'engadget.com'], 'out.pdf')

pdfkit.from_file(['file1.html', 'file2.html'], 'out.pdf')

You can also pass an open file:

with open('file.html') as f:
    pdfkit.from_file(f, 'out.pdf')

If you want to process the generated PDF further, you can read it into a variable:

# Pass False as the output path to get the PDF back as the return value
pdf = pdfkit.from_url('http://google.com', False)
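One simple kind of further processing is a sanity check before the bytes are stored or sent on. The sketch below assumes the variable holds the raw PDF as bytes; the helper names looks_like_pdf and save_pdf are hypothetical and not part of pdfkit.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Return True when the data starts with the PDF file header."""
    return data.startswith(b"%PDF-")


def save_pdf(data: bytes, path: str) -> int:
    """Write the bytes to path after a basic header check; return bytes written."""
    if not looks_like_pdf(data):
        raise ValueError("data does not start with a PDF header")
    with open(path, "wb") as fh:
        return fh.write(data)
```

With pdfkit installed, the value returned by pdfkit.from_url('http://google.com', False) could be passed straight to save_pdf(pdf, 'copy.pdf').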

You can specify any wkhtmltopdf option (http://wkhtmltopdf.org/usage/wkhtmltopdf.txt). The leading '--' of the option name can be dropped. If an option takes no value, use None, False or '' as the dictionary value:

options = {
    'page-size': 'Letter',
    'margin-top': '0.75in',
    'margin-right': '0.75in',
    'margin-bottom': '0.75in',
    'margin-left': '0.75in',
    'encoding': "UTF-8",
    'no-outline': None
}

pdfkit.from_url('http://google.com', 'out.pdf', options=options)
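To make the naming rule concrete, here is a rough sketch of how such a dictionary could map onto command-line arguments. It illustrates the convention only and is not pdfkit's actual code; options_to_args is a made-up name.

```python
def options_to_args(options: dict) -> list:
    """Flatten an options dict into wkhtmltopdf-style command-line arguments."""
    args = []
    for name, value in options.items():
        # Accept option names with or without the leading '--'.
        flag = name if name.startswith("--") else "--" + name
        args.append(flag)
        # None, False and '' all mean "a flag that takes no value".
        if value is None or value is False or value == "":
            continue
        args.append(str(value))
    return args


print(options_to_args({"page-size": "Letter", "encoding": "UTF-8", "no-outline": None}))
```

Seen this way, 'no-outline': None simply becomes the bare flag --no-outline, while 'page-size': 'Letter' becomes --page-size Letter.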

By default, PDFKit shows all wkhtmltopdf output. If you do not want to see it, pass the quiet option:

options = {
    'quiet': ''
}

pdfkit.from_url('google.com', 'out.pdf', options=options)

Because of wkhtmltopdf's command syntax, the TOC and Cover options must be specified separately:

toc = {
    'xsl-style-sheet': 'toc.xsl'
}

cover = 'cover.html'

pdfkit.from_file('file.html', 'out.pdf', options=options, toc=toc, cover=cover)

When converting a file or a string, you can use the css option to apply external CSS files.

# Single CSS file
css = 'example.css'

pdfkit.from_file('file.html', 'out.pdf', options=options, css=css)

# Multiple CSS files
css = ['example.css', 'example2.css']

pdfkit.from_file('file.html', 'out.pdf', options=options, css=css)

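You can also pass any option through meta tags in your HTML:

body = """
<html>
  <head>
    <meta name="pdfkit-page-size" content="Legal"/>
    <meta name="pdfkit-orientation" content="Landscape"/>
  </head>
  Hello World!
</html>
"""

pdfkit.from_string(body, 'out.pdf') # with --page-size=Legal and --orientation=Landscape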
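The idea behind meta-tag options is that any meta tag whose name starts with a known prefix is read as an option. The sketch below shows that idea with the standard library's HTML parser; it is an illustration under the assumption of a "pdfkit-" prefix, not pdfkit's own parsing code, and MetaOptionParser and meta_options are hypothetical names.

```python
from html.parser import HTMLParser


class MetaOptionParser(HTMLParser):
    """Collect <meta name="PREFIX<option>" content="<value>"> pairs."""

    def __init__(self, prefix="pdfkit-"):
        super().__init__()
        self.prefix = prefix
        self.options = {}

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name") or ""
        if name.startswith(self.prefix):
            self.options[name[len(self.prefix):]] = attr.get("content") or ""


def meta_options(html: str, prefix: str = "pdfkit-") -> dict:
    """Return the options declared through prefixed meta tags in html."""
    parser = MetaOptionParser(prefix)
    parser.feed(html)
    return parser.options
```

Meta tags without the prefix, such as a viewport declaration, are ignored, which is why a prefix is needed in the first place.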

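Configuration

Every API call takes an optional configuration parameter. It should be an instance returned by pdfkit.configuration(), which takes the configuration options as its initial parameters. The available options are:

wkhtmltopdf -- the location of the wkhtmltopdf binary. By default pdfkit tries to locate it using which (on UNIX-like systems) or where (on Windows).
meta_tag_prefix -- the prefix for pdfkit-specific meta tags. The default is pdfkit-

Example, for when wkhtmltopdf is not on the system path ($PATH):

config = pdfkit.configuration(wkhtmltopdf='/opt/bin/wkhtmltopdf')

pdfkit.from_string(html_string, output_file, configuration=config)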

Troubleshooting

IOError: 'No wkhtmltopdf executable found':

Make sure that wkhtmltopdf is on your system path (PATH) or that its location is set through configuration (see above). Running where wkhtmltopdf on Windows or which wkhtmltopdf on Linux returns the exact location of the wkhtmltopdf binary.
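The same lookup can be done from Python before calling pdfkit, which gives a clearer error than a failed conversion. This is a minimal sketch using only the standard library; find_wkhtmltopdf and the candidate paths are assumptions for illustration.

```python
import shutil


def find_wkhtmltopdf(extra_candidates=()):
    """Return the path of a wkhtmltopdf binary, or None if none is found."""
    # First try the system path, like `which` / `where` would.
    found = shutil.which("wkhtmltopdf")
    if found:
        return found
    # Then try explicit locations; shutil.which also accepts a full path
    # and returns it only if the file exists and is executable.
    for candidate in extra_candidates:
        found = shutil.which(candidate)
        if found:
            return found
    return None


print(find_wkhtmltopdf(["/opt/bin/wkhtmltopdf"]))
```

If a path comes back, it can be handed to pdfkit.configuration(wkhtmltopdf=path); if the result is None, wkhtmltopdf still needs to be installed.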

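IOError: 'Command Failed'

This error means that PDFKit could not process the input. Try running the command shown in the error message directly to see what causes the failure (some versions of wkhtmltopdf fail with a segmentation fault).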

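The PDF is generated, but Chinese text is garbled

Check two things:

1) Chinese fonts are installed on your system

2) The HTML declares its character set, for example with <meta charset="UTF-8"> in the head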
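When the HTML comes from a template or a string you do not fully control, the charset declaration can be added just before conversion. The following sketch assumes UTF-8 content; ensure_charset is a hypothetical helper, not a pdfkit function.

```python
import re


def ensure_charset(html: str, charset: str = "UTF-8") -> str:
    """Insert a <meta charset> tag if the document does not declare one."""
    # Leave documents alone that already declare a charset in a meta tag.
    if re.search(r"<meta[^>]+charset", html, flags=re.IGNORECASE):
        return html
    meta = '<meta charset="%s">' % charset
    # Put the tag right after <head> when there is one, else at the very start.
    new_html, count = re.subn(
        r"(<head(?:\s[^>]*)?>)", r"\1" + meta, html, count=1, flags=re.IGNORECASE
    )
    return new_html if count else meta + html


print(ensure_charset("<html><head></head><body>中文</body></html>"))
```

The result can then be passed to pdfkit.from_string(), together with 'encoding': "UTF-8" in the options dictionary.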

Here is an HTML table I threw together:

<html>
<body>
<table>
<tr>
</tr>
<tr>
</tr>
<tr>
</tr>
<tr>
</tr>
<tr>
<th align="left">tOTAL</th>
</tr>
</table>
</body>
</html>

[Screenshot of the generated PDF]
