This is not a query on music learning. However, those who have been involved in translation/transcription of music notes may have experience with this sort of thing and be able to assist.
A document was created in Tamil and edited by a bunch of us using Google Docs. It was downloaded as a .pdf using Google's pdf conversion function. The online, editable Google Docs version has since been lost (the person who originally created and shared it deleted their account), and with it all the changes made after the last download.
We now want to be able to edit the content, which exists only in the .pdf version. None of the standard pdf-to-Word converters work. Google gives you the option to upload a .pdf and convert it into an editable Google doc, but Tamil is not one of the languages it supports.
Would anyone have ideas to share? If you think it is not of general-enough interest please email me separately at LRamakrishnan.lists@gmail.com
thanks a lot,
vainika/LRamakrishnan
converting Tamil .pdf to .rtf or .doc
-
- Posts: 2807
- Joined: 03 Feb 2010, 16:52
Re: converting Tamil .pdf to .rtf or .doc
Have you tried converting it with Adobe Acrobat (full version)?
-
- Posts: 433
- Joined: 03 Feb 2010, 11:32
Re: converting Tamil .pdf to .rtf or .doc
Yes, Mohan. The .pdf file has the fonts embedded, so there isn't any chance that the text can be extracted using Acrobat. Acrobat does not have support for Tamil. As of now the only option (one that was suggested today) appears to be printing the document and scanning it into a machine that has Tamil fonts and Unicode support.
-
- Posts: 10956
- Joined: 03 Feb 2010, 00:01
Re: converting Tamil .pdf to .rtf or .doc
Ramakrishnan: Open it with Adobe Acrobat, and look at 'Properties' under 'File'.
Under the 'Fonts' tab in Acrobat, you should see that all of your fonts have been embedded (e.g. Embedded Subset).
Maybe that will give you some ideas about which fonts to bring into Word.
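For anyone without the full version of Acrobat, a rough way to get a similar font list is to scan the raw bytes of the .pdf for font names. What follows is only a minimal sketch in Python, under some assumptions: the function names list_fonts and has_tounicode are made up for illustration, the font name Latha in the sample is just an example, and the scan only sees font dictionaries that are stored uncompressed (a PDF that packs its objects into compressed object streams may show nothing). Subset-embedded fonts can be recognised by a tag of six uppercase letters followed by '+' in front of the font name. The sketch also checks whether the marker /ToUnicode appears anywhere in the file; fonts without a ToUnicode table are generally much harder to extract readable text from.

```python
import re

# A subset-embedded font carries a tag of six uppercase letters plus "+"
# in front of the real font name, e.g. /ABCDEF+Latha.
SUBSET_TAG = re.compile(rb"^[A-Z]{6}\+")

# Font names appear as /BaseFont /Name or /FontName /Name in font dictionaries.
FONT_NAME = re.compile(rb"/(?:BaseFont|FontName)\s*/([^\s/<>\[\]()]+)")


def list_fonts(pdf_bytes: bytes) -> dict:
    """Map each font name found in the raw bytes to True if it is a subset.

    Only uncompressed font dictionaries are visible to this scan.
    """
    fonts = {}
    for match in FONT_NAME.finditer(pdf_bytes):
        raw = match.group(1)
        is_subset = bool(SUBSET_TAG.match(raw))
        name = SUBSET_TAG.sub(b"", raw).decode("latin-1")
        # Keep True if any occurrence of this font was a subset.
        fonts[name] = fonts.get(name, False) or is_subset
    return fonts


def has_tounicode(pdf_bytes: bytes) -> bool:
    """Report whether any /ToUnicode entry is visible in the raw bytes."""
    return b"/ToUnicode" in pdf_bytes


# Hand-written sample bytes standing in for a real file (illustrative only).
sample = (
    b"<< /Type /Font /BaseFont /ABCDEF+Latha /ToUnicode 12 0 R >> "
    b"<< /Type /Font /BaseFont /Helvetica >>"
)
print(list_fonts(sample))     # {'Latha': True, 'Helvetica': False}
print(has_tounicode(sample))  # True
```

To try it on a real file, read the file in binary mode and pass the bytes in, for example list_fonts(open("document.pdf", "rb").read()), where document.pdf is a placeholder name. An empty result does not prove there are no fonts, only that none were visible without decompressing the file.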
Also, found this thread from Adobe forum which also talks about the OCR method you mention: http://forums.adobe.com/thread/427945
If I think of anything else, I will post. I am sure you have asked the Google Docs support people and forums for help. If not, that is another avenue. They should at least provide a way to convert an uploaded PDF into an editable format. (We sure can wish, can't we!!)
Your experience illustrates a problem with collaborative editing and cloud syncing: we still have to take adequate 'backup' measures ourselves. The usual notion that 'Cloud equals no backup needed' is a false one in these circumstances.
-
- Posts: 225
- Joined: 14 Sep 2008, 01:15
Re: converting Tamil .pdf to .rtf or .doc
I once tried to do this for Telugu, but gave up. The technical problem is described at http://blogs.adobe.com/insidepdf/2008/0 ... files.html but, in simple terms, it's unlikely to work.
- Sreenadh