I’m a member of an American Legion, and I’ve been working with them on displaying images such as schedules on screens. For a while I used various programs to capture an image of a website and save it to a JPG file. That got me thinking that there had to be a way to do this in a script. This article is about how I did just that.
The first thing I needed was an easy way to bring a web page (HTML) into memory for conversion to a JPG. After much searching I found a handy library called NReco. With a DLL that I can import, I can load it into my PowerShell script with Add-Type.
For simplicity I put the DLL in a sub-folder of the folder where my script resides. With the DLL imported, it’s on to seeing what it can do for me. According to the article, the DLL will convert HTML to a JPG in one line of code. So I chose to take advantage of Invoke-WebRequest and point it at www.powershell.org to see whether it would save the page.
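As a rough sketch of that import step: the sub-folder name 'NReco' and the file name NReco.ImageGenerator.dll below are assumptions for illustration, not necessarily what my script used, so adjust both to match your own layout.

```powershell
# Assumption: the DLL sits in a sub-folder named 'NReco' beside this script.
# Both the folder name and the DLL file name are hypothetical here.
# $PSScriptRoot is only populated when this runs from a saved script file.
$dllPath = Join-Path -Path $PSScriptRoot -ChildPath 'NReco\NReco.ImageGenerator.dll'

# Fail early with a readable message if the DLL is not where we expect it.
if (-not (Test-Path -LiteralPath $dllPath)) {
    throw "Could not find the DLL at $dllPath"
}

# Load the assembly so its types are available to New-Object.
Add-Type -Path $dllPath
```

If you run the lines interactively instead of from a saved script, $PSScriptRoot is empty, so swap in a full path to the DLL.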
$html = invoke-webrequest -uri 'https://powershell.org/forums/'
$h2image = new-object NReco.ImageGenerator.HtmlToImageConverter
$imageFormat = [NReco.ImageGenerator.ImageFormat]::Jpeg
$jpeg = $h2image.generateImage($html, $imageformat)
$dataStream = New-Object System.IO.MemoryStream(,$jpeg)
$img = [System.Drawing.Image]::FromStream($dataStream)
$img.save('c:\temp\image.jpg')
$h2image is an instance of the converter class from the DLL we pulled in; it is what allows us to convert the web page to a JPG. Depending on the size of the page, the generateImage call may take a little while to return.
$h2image = new-object NReco.ImageGenerator.HtmlToImageConverter
The next line of code sets the image format, which tells the DLL what type of file we want to produce. IntelliSense in the ISE shows that this enumeration includes three formats.
For what I needed I chose JPG.
Now that I have the converter object and the file format, I can generate the image and stream the resulting byte array into memory:
$dataStream = New-Object System.IO.MemoryStream(,$jpeg)
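This one took me a while to figure out. If it hadn’t been for this article, I might never have figured it out: Cup(Of T): 3 PowerShell Array Gotchas, http://piers7.blogspot.com/2010/03/3-powershell-array-gotchas.html
The solution for getting the array to be streamed is the leading comma in (,$jpeg). New-Object otherwise unrolls the byte array into separate constructor arguments; the unary comma wraps it in a one-element array, so the whole byte array is passed to the MemoryStream constructor as a single argument.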
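To see the comma trick on its own, here is a small sketch that needs no DLL. The four stand-in bytes are made up for illustration; in the script above the array would be the one returned by generateImage. The ::new() form at the end is an alternative I did not use in the script, and it assumes PowerShell 5.0 or later.

```powershell
# Stand-in bytes (hypothetical data); the real script gets its array from generateImage.
[byte[]]$bytes = 1, 2, 3, 4

# The unary comma wraps $bytes in a one-element array, so New-Object hands
# the whole byte[] to the MemoryStream(byte[]) constructor as one argument.
$viaComma = New-Object System.IO.MemoryStream(,$bytes)

# The same call written out with named parameters.
$viaArgumentList = New-Object -TypeName System.IO.MemoryStream -ArgumentList (,$bytes)

# Assumes PowerShell 5.0+: the static ::new() call does not unroll the array.
$viaNew = [System.IO.MemoryStream]::new($bytes)

# Each stream should report the same length as the source array.
$lengths = $viaComma.Length, $viaArgumentList.Length, $viaNew.Length
$lengths

# Release the streams when done.
$viaComma.Dispose(); $viaArgumentList.Dispose(); $viaNew.Dispose()
```

Dropping the comma from the first call makes New-Object treat each byte as its own constructor argument, which is the gotcha the linked article describes.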
4 thoughts on “Capture Web Page / HTML to JPG”
Thanks! How do you install the NReco Module? I searched PowerShell with Get-PackageSource but could not find it. I’m assuming there must be another way.
Download the NuGet package and extract the DLL from it. The .nupkg file is a zip archive, so traditional unzip methods work on it.
Thanks for your post. It failed for me at this line:
$jpeg = $h2image.generateImage($html, $imageformat)
with an error saying that wkhtmltoimage is needed (in path: C:\Windows\System32\WindowsPowerShell\v1.0\wkhtmltoimage)
How can I install this?
Have you downloaded and unzipped the package as described in the article? https://www.nuget.org/packages/NReco.ImageGenerator/