Re: video to SVG

On Wed, Jun 30, 2010 at 10:42 PM, John Swensen <address@hidden> wrote:

On Jun 30, 2010, at 12:47 PM, narendra sisodiya wrote:

>
>
> On Wed, Jun 30, 2010 at 7:08 PM, John Swensen <address@hidden> wrote:
>
> On Jun 30, 2010, at 9:07 AM, narendra sisodiya wrote:
>
> >
> >
> > On Wed, Jun 30, 2010 at 6:26 PM, Marc Weber <address@hidden> wrote:
> > Excerpts from narendra sisodiya's message of Wed Jun 30 14:24:58 +0200 2010:
> > > Hi, I am trying to work on a project for Video to SVG conversion.
> > > Basically, classroom videos that show only a whiteboard with lines and
> > > curves can be compressed by converting them into vector markup like SVG. All I
> > > need is to find the curves and lines inside the video/images (frames).
> > >
> > > Any starting points if I use Octave to build this project. I have done some
> > > basic matrix manipulation on octave but I have not done any video or image
> > > processing using octave or any other tools
> >
> > Are you sure that Octave is the best tool for the task? (Maybe it is; I
> > don't know it very well.) I'd just use ffmpeg (or mencoder or such) to
> > create many .jpg files from the video.
> >
> > Video to image is not a problem.
> > I want to work on each image at the pixel level to detect the lines and curves inside it, and how they change from one frame to the next.
> >
> >
> > Then I'd try using an existing OCR
> > solution which outputs .svg files. (Inkscape has a built-in OCR
> > command. This could be a starting point.)
> >
> > I do not think I need OCR. On a whiteboard there can be 1000 types of handwriting. All I need is to convert lines and curves into their markup.
> >
>
> There are still some holes in your specification. Are you going to want to be able to take a completely filled whiteboard with no occlusions and convert it to SVG,
>
> Yes, the full video contains a whiteboard with a teacher. The image of the teacher will be removed by another algorithm.
>
> or are you planning on taking each frame of the video individually and converting each frame to SVG and re-concatenate to a movie of SVG files?
>
> Not exactly. First the video will be converted into images, and then each image will be differenced against the previous image in order to get the incremental SVG image. Finally, SVG animation tags will be used to display the result like a video. I know this is a tough task that requires a lot of effort and many intermediate stages.
>
> Will the professor be standing in front of the board and occluding portions of the board and casting a shadow on the board?
>
> The professor's image will be removed and the whole image will be converted to grayscale before the above operations.
>
>
> Did you want to take existing lectures and convert them, or will this be new lecture and you can in some manner control how they are presented?
>
> I am targeting both. Basically, countries that are poor in terms of network bandwidth, like India and other third-world countries, cannot watch videos on YouTube or in other video formats. Converting the videos into SVG will reduce their size to roughly the bandwidth of the audio, so they can be streamed easily. In the future, if an SVG player gets included in mobile phones, students can easily play these SVG lectures and send them via Bluetooth.
>
>
> I know you probably don't want a high cost solution, but if you can constrain how the lectures are being presented, the best solution would be to have a person sit at a tablet PC or WACOM tablet and talk into a microphone as they present their lecture.
>
> I have already made this solution. The demo needs the Opera browser for animations: http://wiki.techfandu.org/eduvid/svg-edit/svgeditplay/editor/svg-editor-recorder.html It is an SVG recorder and player, and it is part of the Eduvid Project.
>
> In this manner, pen strokes and pressures can easily be recorded and the SVG is easily reconstructed from the recorded information.
>
> That being said, if you really want to just convert video of a whiteboard to SVG then some pretty basic image processing can probably get you pretty far (depending on the quality of the video). If I were doing this, my first stab would be as follows:
> 1) If the camera isn't moving, crop out everything outside the actual whiteboard area. This can be accomplished by taking an initial frame with no writing and subtracting this background image from each subsequent image. Make sure you normalize the brightness of each image to account for lighting changes.
>
> Basically, grayscale or just black and white will serve the purpose initially.
>
> Also make sure that when the lecturer decides to erase the whiteboard that they erase it well.
>
> Erasing is like "white ink".
>
> This will have a big effect on the next step.
> 2) Perform simple thresholding to determine what is writing and what is not. This may be an exercise in "knob-turning" based on the color and quality of the markers being used. You might recommend the lecturer use only dark colored markers and be liberal in throwing away markers that are starting to dim.
> 3) Use a connected components algorithm to identify what is real writing and what are stray marks on the board.
>
> Any link for a connected components algorithm?
>
> 4) Use an edge detection algorithm on each connected component to define an SVG polygon. You can be smart about decimating the points on the boundary of the polygon and even converting the connected components to strokes, but making a polygon of the boundary is definitely the easiest.
>
> Under "ideal" conditions, this should do an OK job. Whenever I have a computer vision/image processing problem, my first step is to search SIGGRAPH for an existing solution.
>
> What is SIGGRAPH, and how do I search their solutions? (I mean other than Google.)
>
> If you have access to scholarly publications, you would be surprised how easy it is to find solutions to fairly common problems.
>
> John Swensen
>
>
> I am out of my university but I have access to publications so I will search.
>
> Thanks for your kind and detailed reply.
>
> --

OctaveForge already has a connected components function for binary images. It is called bwlabel. Basically you pass in a binary image (already thresholded) and it returns another image with groups of connected components labeled as {1,2,3,...,n} for each pixel that corresponds to the i-th blob of connected components. There are much more complex methods of segmentation, but I have used bwlabel in Matlab successfully in the past. I always *try* simple solutions before getting fancy. I don't know how Octave's bwlabel compares to Matlab's, however.
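
To make the labeling idea concrete, here is a minimal plain-Python sketch of connected-components labeling on a thresholded image. It is only an illustration of the technique, not the bwlabel implementation: the function names `label_components` and `component_sizes` are hypothetical, the image is assumed to be a list of rows of 0/1 values, and 8-connectivity (diagonal neighbours count as connected) is an assumption.

```python
from collections import deque


def label_components(binary):
    """Label 8-connected foreground blobs in a binary image.

    `binary` is a list of rows holding 0 (background) or 1 (ink).
    Returns (labels, count), where labels has the same shape and each
    foreground pixel carries the number 1..count of its blob.
    """
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or labels[r][c]:
                continue  # background, or already part of a labeled blob
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            # Breadth-first flood fill over the 8 neighbours of each pixel.
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def component_sizes(labels, count):
    """Return the pixel count of each blob, in label order."""
    sizes = [0] * (count + 1)
    for row in labels:
        for value in row:
            sizes[value] += 1
    return sizes[1:]


image = [
    [1, 1, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0],
]
labels, n = label_components(image)
sizes = component_sizes(labels, n)
```

The blob sizes give one simple way to separate real writing from stray marks: components below some pixel count (a knob you would have to tune) can be discarded before tracing.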

SIGGRAPH is the most prestigious computer vision/computer graphics conference. They are well known for accepting papers with very specific applications (and very clever solutions). Some of it is just eye-candy, but because of the very rigorous review process, papers that do make it in are usually very well written and tutorial.

I think the hardest part of this is going to be the temporal part, where you have to piece together subsequent frames to account for the moving lecturer. In a lot of my research, I have found that for endeavors such as this it makes sense to automate much of it, but if you can get someone to sit down and guide the procedure it will make a huge difference. What I mean by "guiding the procedure" is that maybe someone adds time markers to the video that indicate the start and end of a cohesive sequence. For example, if your algorithm doesn't have to detect when the professor is done filling the boards and is going back and erasing, it will save you from implementing a lot of smarts in your software that are hard or impossible for a computer, but that my 5-year-old son could easily be taught to identify.
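
For the easy case only (static camera, nobody in front of the board), the frame-to-frame bookkeeping can be sketched as below. This is a hedged illustration, not a solution to the occlusion problem: `threshold` and `frame_delta` are hypothetical names, frames are assumed to be grayscale lists of rows with values 0..255, and the cutoff of 128 is an arbitrary assumption that would need tuning.

```python
def threshold(gray, cutoff=128):
    """Mark a pixel as ink (1) when it is darker than `cutoff`, else 0."""
    return [[1 if value < cutoff else 0 for value in row] for row in gray]


def frame_delta(prev_ink, curr_ink):
    """Compare two thresholded frames of equal size.

    Returns (added, erased): lists of (x, y) pixels that became ink and
    pixels that stopped being ink between the two frames.
    """
    added, erased = [], []
    for y, (prev_row, curr_row) in enumerate(zip(prev_ink, curr_ink)):
        for x, (p, c) in enumerate(zip(prev_row, curr_row)):
            if c and not p:
                added.append((x, y))
            elif p and not c:
                erased.append((x, y))
    return added, erased


# Two tiny grayscale frames: one dark pixel moves from (1, 1) to (1, 0).
prev_frame = [[255, 255, 255],
              [255,   0, 255]]
curr_frame = [[255,   0, 255],
              [255, 255, 255]]
added, erased = frame_delta(threshold(prev_frame), threshold(curr_frame))
```

Only the `added` and `erased` pixels would need to be traced and emitted for each step, which is where manual time markers could help by telling the software which spans of frames are safe to difference.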

John Swensen

Here is my first step.
I have tested potrace, and it does what I want. I have posted my results on image conversion using potrace - http://lug-iitd.posterous.com/convert-images-to-svg-format-potrace-is-excel

For the first example, I will use a screen-casting video (the ideal condition, where the camera is not moving).

Now, I will scale down the video to 4-5 frames per second, because a classroom video does not need a higher fps. In fact, for SVG drawing 1 fps is good.
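
As a rough sketch of the frame-rate reduction, assuming the frames have already been extracted to numbered images, the helper below picks which frame indices to keep. The name `frames_to_keep` and its parameters are hypothetical, and it simply drops frames rather than averaging them.

```python
def frames_to_keep(total_frames, source_fps, target_fps):
    """Return the indices of the frames kept when reducing the frame rate.

    Frames are numbered from 0. If the target rate is not lower than the
    source rate, every frame is kept.
    """
    if source_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    if target_fps >= source_fps:
        return list(range(total_frames))
    step = source_fps / target_fps  # source frames per kept frame
    indices = []
    k = 0
    while int(k * step) < total_frames:
        indices.append(int(k * step))
        k += 1
    return indices


# A 4-second clip recorded at 25 fps, reduced to 5 fps and to 1 fps.
kept_5fps = frames_to_keep(100, 25, 5)
kept_1fps = frames_to_keep(100, 25, 1)
```

At 1 fps only every 25th frame of a 25 fps recording survives, so the later tracing and differencing steps have far fewer images to process.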

From:	narendra sisodiya
Subject:	Re: video to SVG
Date:	Thu, 1 Jul 2010 00:41:53 +0530