Help for a scanned document
Ben

2004-06-02, 12:14 pm

Hi there!

Could somebody help me, please?

I am trying to extract text from a document with FineReader.
It works fine with black text on a white background.
The problem is that some parts of the document have black text on a kind of
gray background.
In fact, this gray background is made of small black dots, just like the text,
and the OCR picks them up.

Is it possible to remove the background with Photoshop so that only the text
is left?

I hope my English is understandable.

Thanks.

Ben.


L. McKenzie

2004-06-02, 12:14 pm

Make a mask of just the text, minus the background, in PS. View the image
at 100% zoom. If the gray and the black look distinct enough, use the
Color Range command; if the colors are too similar, use the Pen (path)
tool to trace the text by hand. Once you have selected the text that you
want, press Cmd+J (Mac) or Ctrl+J (Windows) to create a new layer that
contains just your selection. Then go back to the original layer, make
sure that nothing is selected, and fill it with pure white. Flatten the
image and try your OCR software with that.
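
For readers who want to see the logic of the selection steps above outside Photoshop, here is a minimal sketch in plain Python. It is only an illustration, not what Photoshop does internally: the function names, the tiny sample `scan`, and the `fuzziness` value of 40 are all assumptions made up for this example. Pixels are grayscale values from 0 (black) to 255 (white).

```python
def select_by_tone(image, target=0, fuzziness=40):
    """Mask is True where a pixel is within `fuzziness` of the `target` tone."""
    return [[abs(value - target) <= fuzziness for value in row] for row in image]


def keep_selection_on_white(image, mask):
    """Copy the selected pixels onto a pure white page."""
    return [
        [value if keep else 255 for value, keep in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


# Hypothetical 3x6 scan: 0-10 = dark text pixels, 150 = flat gray background.
scan = [
    [150, 150, 150, 150, 150, 150],
    [150,   0,  10, 150,   5, 150],
    [150, 150, 150, 150, 150, 150],
]

mask = select_by_tone(scan, target=0, fuzziness=40)
result = keep_selection_on_white(scan, mask)

for row in result:
    print(row)
```

This only separates text from background when the two really differ in tone. If the gray is printed as small black dots, each dot is as dark as the text and a tone-based selection will keep it.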

--

__________________________________________________
Leo McKenzie
www.solocomputerservices.com/scsgrafx.php


Ben

2004-06-02, 12:14 pm

The text wasn't written in Photoshop.
It's a scanned document.

How can I make a mask of just the text minus the background?



Stephen H. Westin

2004-06-02, 12:14 pm

What you really need is "descreening". Perhaps your scanner has this
option. If not, you can try a Gaussian blur that will turn the dots into
a bumpy gray color, then a threshold operation so that the gray background
goes to white, while the text goes to black. Of course, you will be removing
some detail from the text, perhaps enough to break the OCR.
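
The blur-then-threshold idea can also be tried outside Photoshop. The following is a rough sketch in plain Python, not a real descreening filter: the synthetic test page (a dot pattern with a solid black bar standing in for text), the blur radius `sigma=1.5`, and the `cutoff` of 128 are all assumptions chosen for illustration, and a real scan would need its own values. Pixels are grayscale values from 0 (black) to 255 (white).

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian weights covering about 3 sigma on each side."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_line(line, kernel):
    """Blur one row or column, repeating the edge pixel past the border."""
    radius = len(kernel) // 2
    last = len(line) - 1
    blurred = []
    for x in range(len(line)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            src = min(max(x + k - radius, 0), last)
            acc += weight * line[src]
        blurred.append(acc)
    return blurred


def gaussian_blur(image, sigma):
    """Separable Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    rows = [blur_line(row, kernel) for row in image]
    cols = [blur_line(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def threshold(image, cutoff):
    """Pixels darker than `cutoff` become black, everything else white."""
    return [[0 if value < cutoff else 255 for value in row] for row in image]


def make_test_page(width=41, height=25):
    """Hypothetical scan: black dots on white plus one solid black bar."""
    page = [[255] * width for _ in range(height)]
    for y in range(1, height, 2):          # dot pattern: one black pixel
        for x in range(1, width, 2):       # in every 2x2 cell
            page[y][x] = 0
    for y in range(8, 16):                 # solid bar standing in for text
        for x in range(10, 30):
            page[y][x] = 0
    return page


page = make_test_page()
cleaned = threshold(gaussian_blur(page, sigma=1.5), cutoff=128)

# Crude text rendering: '#' for black, '.' for white.
for row in cleaned:
    print("".join("#" if value == 0 else "." for value in row))
```

A threshold alone would not help here, because each dot is as black as the text. The blur averages the dots with the surrounding white into a light gray that falls above the cutoff, while the solid bar stays dark. The edges of the bar are softened by the same blur, which is the loss of detail mentioned above.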

--
-Stephen H. Westin
Any information or opinions in this message are mine: they do not
represent the position of Cornell University or any of its sponsors.


L. McKenzie

2004-06-02, 12:14 pm

It won't matter. It is an image.

--

__________________________________________________
Leo McKenzie
www.solocomputerservices.com/scsgrafx.php


L. McKenzie

2004-06-02, 12:14 pm

Ben, do a Google search for making masks in Photoshop, or use the tip
that Stephen posted. Either will work for you. Thanks.

--

__________________________________________________
Leo McKenzie
www.solocomputerservices.com/scsgrafx.php