Video Recognition

Vincimus · May 8, 2008

I'm trying to develop a smart car gadget that allows a navigation system to identify stop signs, cars, and roads via video identification. I have seen this before in a hack for Halo Trial. It was a program that identified colours in-game and then followed the colour with the mouse, giving you an auto-aim system. I would like to apply this in one of my VB.NET programs, though I'm not sure where to start. It would work by identifying white and yellow stripes along roads, then processing the information and telling drivers what a road sign says or how far a car is from their vehicle. For right now I only need information on how to get a computer to identify parts of a video, highlight them, and then provide feedback.

Crusty · May 8, 2008

Straight up, you are most likely biting off a LOT more than you can chew. People spend their entire graduate school careers devoted to this stuff. I'm not saying it isn't possible, but if you are asking the questions you are, then I have a feeling it's going to be incredibly difficult.

Vincimus · May 8, 2008

For sure, but I'm looking to make a little money from my work, considering I am still in school and have plenty of time to work on it. I figure if I can get this accomplished and get the device working, I could sell it.

Vincimus · May 8, 2008

No ideas? I have a live camera I can use for testing.

CycloWizard · May 9, 2008

As Crusty mentioned, there are entire research journals dedicated to computer vision (such as the International Journal of Computer Vision). This field is an entire science unto itself. The fundamentals are pretty straightforward, but extracting features from a video stream in real time is not. That said, I can give you some pointers because I spent a large amount of my graduate career learning the practical aspects of this stuff.

For simplicity, I'll consider grayscale only since it is far easier to use grayscale for this sort of thing, and it's trivial to convert color images to grayscale. Each digital image is stored as an array of data where each pixel is a value in the array. If your image is 8 bit grayscale, that implies that there are 2^8=256 possible values for each spot in the array. The value depends on the intensity of light there (i.e. how white or black the pixel is).
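As a rough illustration of the array idea, here is a minimal Python sketch (Python is an assumption here; the thread itself talks about VB.NET and MATLAB). It converts a tiny colour image to 8-bit grayscale using the Rec. 601 luma weights, which are one common choice and not something specified in the post. The function name `to_grayscale` and the sample pixels are made up for the example.

```python
def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples (0-255 each) to rows of 8-bit intensities."""
    gray = []
    for row in rgb_image:
        gray.append([
            # Rec. 601 luma weights: an assumed, commonly used weighting.
            min(255, int(round(0.299 * r + 0.587 * g + 0.114 * b)))
            for (r, g, b) in row
        ])
    return gray


# A 2x2 test image: white, black, pure red, mid gray.
image = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (128, 128, 128)],
]
gray = to_grayscale(image)
for row in gray:
    print(row)
```

Every value in the result fits in one byte, which is the 2^8 = 256 levels mentioned above.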

Once you have an array of image intensity data, feature detection is basically accomplished by finding patterns in the data that are somehow related to the feature you're looking for. This is often accomplished using edge-finding filters and/or morphological operators. Once you know the location of the edges in the image, you can see if any of them matches the description of what you're looking for.
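To make "edge-finding filter" concrete, here is a minimal plain-Python sketch of a Sobel gradient filter followed by a threshold. Sobel is a simpler stand-in chosen for brevity, not the filter the post goes on to recommend, and the threshold value is an arbitrary assumption that would need tuning on real images.

```python
def sobel_magnitude(gray):
    """Approximate gradient magnitude for interior pixels; border pixels stay 0."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal intensity change (right column minus left column).
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            # Vertical intensity change (bottom row minus top row).
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            out[y][x] = abs(gx) + abs(gy)
    return out


def edge_mask(gray, threshold):
    """Mark pixels whose gradient magnitude reaches the threshold with 1."""
    return [[1 if m >= threshold else 0 for m in row]
            for row in sobel_magnitude(gray)]


# Dark left half, bright right half: the only edge is the vertical boundary.
test_image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = edge_mask(test_image, threshold=200)
for row in edges:
    print(row)
```

The mask lights up only in the two columns that straddle the dark-to-bright boundary.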

For example, if I know I'm looking for a stop sign, I could look for any continuous eight-sided object in the image. To do this, I might apply a Canny filter (one type of edge-finding filter that is very common) to the image. This result will be somewhat noisy and may have some holes in it, so then I might apply a morphological closing operator to fill the holes to ensure continuity of the edges. Then, I can see if there are eight contiguous edges anywhere in the image and voila, I have my stop sign.
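The hole-filling step can be sketched as a 3x3 morphological closing (a dilation followed by an erosion) on a binary edge mask. This is a plain-Python illustration that assumes the edges are already marked as 1s; the helper names are hypothetical, and an image toolbox would provide tuned equivalents.

```python
def _neighbourhood(mask, y, x):
    """Yield the values in the 3x3 window around (y, x), ignoring pixels off the image."""
    h, w = len(mask), len(mask[0])
    for ny in range(max(0, y - 1), min(h, y + 2)):
        for nx in range(max(0, x - 1), min(w, x + 2)):
            yield mask[ny][nx]


def dilate(mask):
    """A pixel turns on if any pixel in its 3x3 window is on."""
    return [[int(any(_neighbourhood(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def erode(mask):
    """A pixel stays on only if every pixel in its 3x3 window is on."""
    return [[int(all(_neighbourhood(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def close_gaps(mask):
    """Morphological closing: dilate to bridge small gaps, then erode back."""
    return erode(dilate(mask))


# A horizontal edge with a one-pixel hole in the middle.
broken_edge = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]
closed = close_gaps(broken_edge)
for row in closed:
    print(row)
```

The gap is bridged while the line keeps its original length and thickness. Larger holes would need a larger window, which is one of the parameters that has to be tuned.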

This sort of heuristic approach sounds simple but can take years to implement in practice because of the number of parameters that need to be tuned to optimize the various algorithms for your particular application, camera gear, lighting conditions, etc. Further, the comparison (i.e. "Does this object have eight sides?") becomes a real problem in most cases, since perspective and geometry are important factors.
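One way to see how sensitive the "eight sides" test is to its parameters is to count corners with polygon simplification. The sketch below uses the Ramer-Douglas-Peucker algorithm, which is a technique swapped in for illustration and not one named in the post. The outline generator, the function names, and both tolerance values are assumptions; real contours would come from the edge and closing steps and would be far noisier.

```python
import math


def _distance_to_line(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dx * (ay - py) - dy * (ax - px)) / length


def simplify(points, tolerance):
    """Ramer-Douglas-Peucker simplification of an open polyline."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    index, worst = max(
        ((i, _distance_to_line(points[i], a, b)) for i in range(1, len(points) - 1)),
        key=lambda item: item[1],
    )
    if worst <= tolerance:
        return [a, b]
    return simplify(points[:index + 1], tolerance)[:-1] + simplify(points[index:], tolerance)


def count_sides(contour, tolerance):
    """Count corners of a closed outline (ordered points, first point not repeated).

    Assumes the outline starts at a corner; the two split points always count as corners.
    """
    far = max(range(len(contour)),
              key=lambda i: math.hypot(contour[i][0] - contour[0][0],
                                       contour[i][1] - contour[0][1]))
    first_half = simplify(contour[:far + 1], tolerance)
    second_half = simplify(contour[far:] + [contour[0]], tolerance)
    return len(first_half[:-1] + second_half[:-1])


def regular_polygon_outline(sides, radius, points_per_edge):
    """Synthetic, noise-free outline standing in for a detected contour."""
    corners = [(radius * math.cos(2 * math.pi * k / sides),
                radius * math.sin(2 * math.pi * k / sides)) for k in range(sides)]
    outline = []
    for k in range(sides):
        (ax, ay), (bx, by) = corners[k], corners[(k + 1) % sides]
        for step in range(points_per_edge):
            t = step / points_per_edge
            outline.append((ax + t * (bx - ax), ay + t * (by - ay)))
    return outline


octagon = regular_polygon_outline(sides=8, radius=100.0, points_per_edge=10)
print(count_sides(octagon, tolerance=1.0))   # tight tolerance
print(count_sides(octagon, tolerance=40.0))  # loose tolerance
```

With the tight tolerance the outline reads as eight-sided; with the loose one the same outline collapses to four corners, so the "stop sign" test silently fails. Perspective distortion and noisy edges push real contours toward exactly this kind of ambiguity.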

Vincimus · May 9, 2008

Okay, thanks. I'm going to go out to the road, take a few pictures, and work on it. Second question: which type of sensor is best for detecting the distance between objects?

Cogman · May 9, 2008

Originally posted by: Vincimus

I'm trying to develop a smart car gadget that allows a navigation system to identify stop signs, cars, and roads via video identification. I have seen this before in a hack for Halo Trial. It was a program that identified colours in-game and then followed the colour with the mouse, giving you an auto-aim system. I would like to apply this in one of my VB.NET programs, though I'm not sure where to start. It would work by identifying white and yellow stripes along roads, then processing the information and telling drivers what a road sign says or how far a car is from their vehicle. For right now I only need information on how to get a computer to identify parts of a video, highlight them, and then provide feedback.

VB.net, Halo hack, yeah, I don't think you quite grasp the complexity of this situation.

For the most part, colors in Halo are static and very well defined; in real life, they aren't. For example, if you point your camera at the sun and then at the road, you will see the camera adjust its exposure and everything will appear to get brighter. Do you know how many unique colors you would have cycled through at any given point?

Not to mention noise. Halo doesn't have any; real-life video cameras pick up TONS of noise. That means that even in a very controlled environment, any given camera focused on one point will give you a pixel that gets lighter, darker, slightly redder, etc. every other frame or so. And noise filtering is VERY complex.
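For a sense of what the simplest kind of noise filtering looks like, here is a minimal sketch that averages several consecutive grayscale frames pixel by pixel. It assumes the camera and the scene are both stationary, which is exactly what a moving car does not give you, so treat it as an illustration of the idea and not as a workable filter for this project. The frame values are invented.

```python
def average_frames(frames):
    """Average equally sized grayscale frames pixel by pixel."""
    count = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [round(sum(frame[y][x] for frame in frames) / count) for x in range(width)]
        for y in range(height)
    ]


# Three noisy readings of a scene whose true brightness is 100 everywhere.
frames = [
    [[100, 102], [98, 100]],
    [[102, 98], [100, 103]],
    [[98, 100], [102, 97]],
]
smoothed = average_frames(frames)
print(smoothed)
```

The frame-to-frame flicker averages out here only because nothing in the scene moved between frames.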

Heck, and then you bring up actually reading the sign. Converting an image to text is also fairly complex (look up OCR).

The one thing that is doable without a large researching team would be getting the distance of a car in front of you (directly in front of you) and that has already been done on some luxury cars. The way to do that is probably with a laser and a very precise timer.
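The laser-and-timer idea is a time-of-flight measurement: the pulse travels to the car and back, so the distance is half the round-trip time multiplied by the speed of light. The sketch below only works through that arithmetic; the function names are made up, and it says nothing about how production systems in cars are actually built.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458


def distance_from_round_trip(seconds):
    """Distance to the target given the laser pulse's round-trip time."""
    return SPEED_OF_LIGHT_M_PER_S * seconds / 2


def round_trip_time(distance_m):
    """Round-trip time of a pulse for a target at the given distance."""
    return 2 * distance_m / SPEED_OF_LIGHT_M_PER_S


t = round_trip_time(30.0)
print(f"round trip for a car 30 m ahead: {t * 1e9:.1f} ns")
print(f"distance error per 1 ns of timing error: {distance_from_round_trip(1e-9):.3f} m")
```

A car 30 m ahead returns the pulse in roughly 200 nanoseconds, and each nanosecond of timing error shifts the answer by about 15 cm, which is why the timer has to be very precise.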

You won't be able to make some handheld device; it is going to take a full-blown quad-core (heck, possibly even an octo-core) computer to handle the data. If you can get this done with VB and a webcam, then you have just solved some of the most complex problems in making a car that can drive itself.

Again, I can't stress this enough: we are talking about a LOT of processing power. This "gadget" is not going to be cheap ($1000, 600 pounds).

Vincimus · May 9, 2008

What I'm saying is I know there are applications used to distinguish edges and colours. I fully grasp the size of this project, but it is something I'm willing to be committed to. I'm using the Halo hack as an example because it is the only instance in which I've seen this type of application used. I'm willing to devote years of my life to this project; I'm just not sure where to start. A friend recommended this site to me. For right now I'm not building the device; I'm only asking how to create a program that can recognize certain areas in a video. This is how I think I should start the process.

Crusty · May 9, 2008

There's a point where you have to realize how unrealistic this project is. Companies who have put thousands of man hours and millions of dollars into R&D for this kind of technology still can't do this kind of processing in real time.

My best advice is to sign up for some image processing classes, and then either get a research job or internship with somebody who is doing this kind of work. Trying to do this lone ranger style is not a bright idea.

Cogman · May 9, 2008

Originally posted by: Vincimus
What I'm saying is I know there are applications used to distinguish edges and colours. I fully grasp the size of this project, but it is something I'm willing to be committed to. I'm using the Halo hack as an example because it is the only instance in which I've seen this type of application used. I'm willing to devote years of my life to this project; I'm just not sure where to start. A friend recommended this site to me. For right now I'm not building the device; I'm only asking how to create a program that can recognize certain areas in a video. This is how I think I should start the process.

Ah, but 99% of those applications work on static images or computer-generated images, not on video streams. The stuff that works on video streams is either proprietary or unreleased. There are no free rides out there.

If you really want to do this then there are a few things I would recommend.

1. Ditch VB.Net, it is going to be way too slow for what you are trying to do, your best bet is adopting C or C++ (but C would be better as it is a bit faster)

2. Join MIT, Oxford, or some school known for its innovations and engineering feats, this isn't a project you can do on your own in your spare time, you really need a team for this.

3. Find books, lectures, anything on video processing (heck, even try reading some AviSynth plugins or the x264 code to get a feel for how people treat video). You really need to get the theory down.

Might I just stress again: you may be committed to this, but you simply won't be able to do this on your own with just a video camera and your home PC. You are going to need a lot of help on this.

Something you might try is implementing a simple color finder in your favorite game, then a shape finder/text reader. Start with something simpler and work up; I wouldn't dive right into full-blown video processing just yet.
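A "simple color finder" along those lines could look like the sketch below: it collects every pixel within a tolerance of a target colour and reports the centre of the matches. The frame is a hand-built list of RGB tuples standing in for a game screenshot, and the function name, target colour, and tolerance are all assumptions for the example.

```python
def find_colour(rgb_image, target, tolerance):
    """Return the (x, y) centre of all pixels within tolerance of target, or None."""
    matches = [
        (x, y)
        for y, row in enumerate(rgb_image)
        for x, pixel in enumerate(row)
        if all(abs(channel - wanted) <= tolerance
               for channel, wanted in zip(pixel, target))
    ]
    if not matches:
        return None
    centre_x = sum(x for x, _ in matches) / len(matches)
    centre_y = sum(y for _, y in matches) / len(matches)
    return (centre_x, centre_y)


# A 4x4 black frame with a reddish 2x2 blob in the middle.
BLACK = (0, 0, 0)
frame = [[BLACK] * 4 for _ in range(4)]
for x, y, colour in [(1, 1, (200, 20, 20)), (2, 1, (210, 25, 15)),
                     (1, 2, (195, 30, 25)), (2, 2, (205, 15, 20))]:
    frame[y][x] = colour

target_position = find_colour(frame, target=(200, 20, 20), tolerance=15)
print(target_position)

# The same frame after the exposure brightens every channel by 60.
brighter = [[tuple(min(255, c + 60) for c in pixel) for pixel in row] for row in frame]
print(find_colour(brighter, target=(200, 20, 20), tolerance=15))
```

The second call finds nothing: a fixed colour threshold that works in a game breaks as soon as the lighting shifts, which is the real-world problem described earlier in the thread.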

You might try looking for resources from the DARPA Grand Challenge to see what the teams there are trying to do.

http://en.wikipedia.org/wiki/DARPA_Grand_Challenge

Vincimus · May 9, 2008

No, no, I do mean to start small. For right now I plan on merely setting up a program that detects and labels shapes in a still image.

My question for right now is: using VB.NET, how do I go about detecting the shapes in a still image and labelling what those shapes are?
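One common starting point for this kind of question is connected-component labelling: once the image has been reduced to a binary mask, every group of touching "on" pixels gets its own number. The sketch below is in Python and not VB.NET (an assumption made to keep the example short), uses a breadth-first flood fill with 4-connectivity, and works on a hand-made mask; classifying what each labelled shape actually is would be a separate step.

```python
from collections import deque


def label_regions(mask):
    """Give each 4-connected group of 1-pixels its own label (1, 2, 3, ...)."""
    height, width = len(mask), len(mask[0])
    labels = [[0] * width for _ in range(height)]
    next_label = 0
    for y in range(height):
        for x in range(width):
            if mask[y][x] and not labels[y][x]:
                # Found an unlabelled shape: flood-fill it with a new label.
                next_label += 1
                labels[y][x] = next_label
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < height and 0 <= nx < width
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels, next_label


mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1],
]
labels, shape_count = label_regions(mask)
print(shape_count)
for row in labels:
    print(row)
```

Each label can then be measured (area, bounding box, outline) to decide what kind of shape it is.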

CycloWizard · May 9, 2008

Originally posted by: Cogman
Ah, but 99% of those applications work on static images or computer-generated images, not on video streams. The stuff that works on video streams is either proprietary or unreleased. There are no free rides out there.

The same methods I described above apply to video, since video is simply a sequence of 2-D images. You can actually use some cool methods to improve recognition in video over single still images by interpolating between frames and such.

Vincimus: if you have MATLAB and the Image Processing Toolbox, I wrote a very extensive program that finds and labels specific edges in an image (with the help of the now-banned Homercles337). I don't have time to upload it now since I'm literally walking out the door for a weekend excursion, but I'll check this thread when I get back.

Vincimus · May 9, 2008

Okay thanks

PolymerTim · May 9, 2008

Hehe, it looks like you're getting a rough welcome, Vincimus. Let me start by saying "Welcome to AnandTech!"

What you're trying to accomplish may not be feasible, but I see no reason to let that stop you from trying. If you get even 10% of the way to your final goal I still think you will learn quite a bit. As long as the education is your true goal and the software you're writing is the means to that end, I think it could be a valuable use of your time.

I'm not a programmer by most people's standards, but I do dabble in a few languages and I've recently worked with some image analysis. It wasn't real time, but as CycloWizard mentioned, videos are just a simple series of still images anyway (as long as you're not converting from analogue video and dealing with interlacing etc., but that can be dealt with as well), so I think you're right to start with still image analysis, and then you can experiment with making it as efficient as possible.

I worked in a programming language called LabView. If you haven't heard of it, it is a graphical programming language that is specifically designed to interface with instruments. Over the years, it has developed into a pretty extensive general programming language as well, but its roots are in acquisition and control. As you can imagine, "machine vision" is one of its specialties. From what I hear, it is growing in popularity in industry for things like motion control and plant operations, so if you're a programmer, it may be a good program to be familiar with. My university (Case Western Reserve) had a car in the DARPA challenge that used LabView for sensors and feedback.

Here is a link to LabView's manual for "vision concepts". It is specific to LabView, but has a lot of good general image analysis information similar to the edge detection and morphological transformations CycloWizard referred to. If you're wanting to know more about image analysis in general, I found it to be pretty helpful for me starting with very little knowledge in either image analysis or programming.

http://digital.ni.com/manuals....F7701F8625731500701D23

LabView did a lot of the hard work for my program. I was analyzing the movement of particles with time and wrote a program that:

1) Identifies all 200 or so particles in each of about 100 frames that I collected from a camera attached to my microscope. The x,y location of these particles is then written to a text file, one file per image processed.

2) Analyzes the locations with time to match particles between frames and outputs a new text file for each particle detailing its changing location with time.

With ~200 particles and 100 frames, my computer (P4-2.8GHz) takes about 3 seconds for step 1 and 17 seconds for step 2. As you can see, the simple image analysis occurs at essentially real-time speeds (30 Hz on a full desktop), but my more elaborate analysis is much slower.
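The matching in step 2 can be illustrated with a greedy nearest-neighbour sketch in Python. This is a hypothetical stand-in and not the LabView program described above: the function name, the `max_distance` cut-off, and the sample coordinates are all assumptions.

```python
import math


def match_particles(previous, current, max_distance):
    """Greedily pair each (x, y) in previous with its nearest unused point in current.

    Returns a dict mapping index in previous -> index in current.
    Points with no candidate within max_distance are left unmatched.
    """
    matches = {}
    taken = set()
    for i, (px, py) in enumerate(previous):
        best, best_distance = None, max_distance
        for j, (cx, cy) in enumerate(current):
            if j in taken:
                continue
            distance = math.hypot(cx - px, cy - py)
            if distance <= best_distance:
                best, best_distance = j, distance
        if best is not None:
            matches[i] = best
            taken.add(best)
    return matches


frame_a = [(10.0, 10.0), (50.0, 40.0), (80.0, 20.0)]
frame_b = [(51.0, 41.5), (11.0, 9.0), (200.0, 200.0)]
links = match_particles(frame_a, frame_b, max_distance=5.0)
print(links)
```

In this sketch every pair of frames costs roughly (particles in one frame) x (particles in the next) distance checks, and the third particle, which moved too far, is simply dropped.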

Another resource you may want to consider is the AnandTech programming forum. It might be helpful for you.
http://forums.anandtech.com/ca...tid=70&flcache=2884092

-Tim

Edit: I really should remember to spell check.

AeroEngy · May 12, 2008

Just thought that I would throw in my 2 cents. This is a difficult area of research, but if you are doing this as a learning exercise and are willing to put in lots of time, then it will be worthwhile.

A friend of mine was working on something similar for aircraft. It essentially used the marker lights on the back of an aerial fuel tanker as part of a control system to guide an aircraft in to be refueled. By judging the distance between the tail marker lights and their orientation to each other, the relative position of the tanker with respect to the aircraft could be determined and used as part of the control system (along with the aircraft's own state vectors from GPS, inertial nav, airspeed sensor, etc. and a detailed model of the aircraft dynamics).
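The range part of that idea follows from the pinhole camera model: two lights a known distance apart appear closer together in the image the farther away they are. The sketch below is a simplified illustration only, not the friend's system: it assumes the lights face the camera squarely, ignores lens distortion, and uses made-up numbers for the focal length (in pixels) and the light spacing.

```python
def range_from_spacing(focal_length_px, real_spacing_m, pixel_spacing):
    """Estimate distance to two markers from how far apart they appear in the image.

    Pinhole model: pixel_spacing = focal_length_px * real_spacing_m / distance.
    """
    if pixel_spacing <= 0:
        raise ValueError("markers must be separated by at least a fraction of a pixel")
    return focal_length_px * real_spacing_m / pixel_spacing


# Hypothetical numbers: 800 px focal length, lights 1.5 m apart.
print(range_from_spacing(800, 1.5, pixel_spacing=60))  # lights 60 px apart
print(range_from_spacing(800, 1.5, pixel_spacing=30))  # lights 30 px apart
```

Halving the apparent spacing doubles the estimated range, so at long distances a one-pixel measurement error turns into a large distance error.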

The video recognition part was a relatively small part of his project compared to the overall control system. However, he used a lot of MATLAB's built-in image processing toolboxes for the majority of the work. I imagine developing those algorithms on your own in C would be very difficult, to say the least.

Good luck! I found this simple video demo on MathWorks' website for detecting cars moving on a highway from a fixed camera. The demo is very simplified, but enjoy.

See link at the bottom of the page for the video.

Vincimus · May 13, 2008

Okay, thanks guys. I just got busy the last week so I couldn't reply. But I'm working on it again! Thanks for the links!

Cuw · May 29, 2008

I'd highly recommend getting familiar with basic digital signal processing because it will be really helpful for this. I have only dabbled in image processing, but from my understanding there is a lot of filtering done on images to find relevant data, and if you don't understand how a one-dimensional filter works, grasping how a two-dimensional filter works will be complicated.
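To show how the one-dimensional and two-dimensional cases relate, here is a minimal sketch of a moving-average filter on a signal and the matching box filter on an image. The 2-D version is built by running the 1-D filter over every row and then every column, which works for this particular (separable) filter; the function names and test data are invented for the example.

```python
def moving_average_1d(signal, width=3):
    """Replace each sample by the mean of the samples in a window around it."""
    half = width // 2
    smoothed = []
    for i in range(len(signal)):
        # The window shrinks at the ends of the signal instead of padding.
        window = signal[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def box_filter_2d(image, width=3):
    """2-D smoothing built from the 1-D filter: rows first, then columns."""
    by_rows = [moving_average_1d(row, width) for row in image]
    by_columns = [moving_average_1d(list(column), width) for column in zip(*by_rows)]
    # zip(*...) transposes back from column-major to row-major order.
    return [list(row) for row in zip(*by_columns)]


spike = [0, 0, 9, 0, 0]
print(moving_average_1d(spike))

image = [[0] * 5 for _ in range(5)]
image[2][2] = 9
blurred = box_filter_2d(image)
for row in blurred:
    print(row)
```

The 1-D filter spreads the spike over three samples; the 2-D filter spreads the single bright pixel over a 3x3 block, which is the same operation applied along both axes.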

Also, MATLAB is a great program to work with for this and is something you should definitely learn. Writing a script in it is relatively quick and easy, which is really nice.

tank171 · Jun 6, 2008

And to think, this is super complicated and really, really hard for a computer to do, whereas our brains can do it easily in a fraction of a second.
