
Need some help from the forums for an experiment in Machine Learning for grading comics.

74 posts in this topic

11 hours ago, bronze_rules said:

Right. But I'm thinking of the centerfold missing as more of an outlier. So, you read in the raw image, create a flat vector from the 2D image, and train against the grade target. Each image has to be perfectly aligned, or you need an algorithm to force alignment, along with presumably the same consistent method of capturing the image (this is another reason why I think the CGC db would be so good). Once the image has been scanned and the grader has an outcome, they do a preliminary check of other features: centerfold present? Y. Page missing? N. OK, grade verified. But, again, I would think these cases would be outliers. The idea is to get volume pre-scans for the majority to reduce workload, provide objective rules, and speed up tasks. We could also add features on top of the image vectors, e.g., F1) number of ticks = 3, F2) longest crease, etc. But these add more work for the data preparer. I truly think the raw image is enough to get close to grader results on average, given enough data and a consistent scan methodology.

I'm not sure why we are assuming that CGC takes detailed scans of books. Is there somewhere that they indicate this is part of their process?

Regarding the interior defects, my assumption was the same as yours. These are outliers. Any cursory examination will show missing or damaged interior content of a book. If something in that regard is damaged then that usually becomes the major factor in downgrading the book. These books are usually excluded from grading by default unless they are much older and rarer books and those aren't really the ones that get graded in volume.  

The front and back cover details are what define the grade for comics being sent through in volume.

11 hours ago, bronze_rules said:

Any deviation in image placement could kill the learner (spine miswrap for example) and has to be accounted for.

Is that the case for a general learner? A spine wrap issue generally moves the book into a different scoring category which you would want to account for in the model, but at the same time some books just come with the spine wrapped differently from different eras.

Or are you referring to placement of the book in the image, i.e., that a book placed incorrectly could be improperly classified?

 


Have you looked at https://comicbookgradingtool.com/?  It's not a very sophisticated tool, but it does go over a lot of the things that you would want to train your model to look for.  It might make more sense to use the model to do a simple binary test (does the comic have spine stresses yes/no).  Then have it do a degrees type of thing (what level of spine stresses does a comic have from none to excessive).  Finally develop an algorithm that looks at all the things mentioned on the online tool's page and try to figure out a grade from them.
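The staged approach described here (binary check, then a severity level, then a rule combining the pieces) can be sketched roughly as below. The defect names, severity scale, and penalty weights are all invented for illustration; they are not actual grading rules.

```python
# Hypothetical sketch of the staged approach: a binary defect check
# (stage 1), a severity level (stage 2), and a rule that combines
# per-defect severities into a rough grade (stage 3). All names and
# penalties here are made up for illustration.
from dataclasses import dataclass

SEVERITY_LEVELS = ["none", "light", "moderate", "heavy", "excessive"]

@dataclass
class DefectScore:
    name: str
    present: bool   # stage 1: does the defect exist at all?
    severity: int   # stage 2: index into SEVERITY_LEVELS

def combine_scores(defects) -> float:
    """Stage 3: start at 10.0, subtract a penalty per severity step,
    then snap to the half-point grading scale."""
    grade = 10.0
    for d in defects:
        if d.present:
            grade -= 0.8 * d.severity   # arbitrary illustrative penalty
    return max(0.5, round(grade * 2) / 2)

defects = [
    DefectScore("spine stress", True, 2),
    DefectScore("corner blunting", True, 1),
    DefectScore("color break", False, 0),
]
print(combine_scores(defects))  # 7.5 on this toy input
```

The point of the staging is that each sub-model answers one narrow question, which needs far less training data than predicting a grade end-to-end.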


2 hours ago, RhialtoTheMarvellous said:

I'm not sure why we are assuming that CGC takes detailed scans of books. Is there somewhere that they indicate this is part of their process? […]

Or are you referring to placement of the book in the image, ie a book that is placed incorrectly could be improperly classified?

 

You are right, I am not certain that they take detailed scans. However, I'd be really surprised if they didn't. For one, it's a safeguard against complaints that someone did something to damage a book. Two, it's a record of historical comparisons that a grader could look at to help them arrive at a decision. I guess a grader could come in and verify.

What I was saying is that if the images themselves are not aligned (think of one image overlaid on another), I don't think the deep learner would work all that well, even on the same comic. A spine wrap would present an image offset, and thus a hugely different feature representation to the learner; the same goes for other orientation offsets. With cats and dogs, what differentiates them is properties like color and shape, and those properties are mostly consistent regardless of orientation differences.

I would try to get the images aligned as closely as possible to start running experiments (or have that be part of the algorithmic pre-prep: crop, then align). Then move on to less stringent orientations once the learner shows it's working.
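A minimal numpy-only sketch of that crop-then-align pre-prep, assuming scans where the book is darker than a bright background; the threshold and output size are arbitrary choices, and a real pipeline would likely use a proper image library for the resampling:

```python
# Crop-then-align pre-prep sketch: find the book's bounding box against
# a bright background, crop to it, and resample every scan to a fixed
# size so images overlay consistently. Threshold and target size are
# illustrative assumptions.
import numpy as np

def crop_and_align(gray, bg_thresh=0.9, out_hw=(64, 44)):
    """gray: 2-D array in [0, 1], book darker than the background."""
    mask = gray < bg_thresh                  # pixels belonging to the book
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)
    r0, r1 = np.where(rows)[0][[0, -1]]
    c0, c1 = np.where(cols)[0][[0, -1]]
    crop = gray[r0:r1 + 1, c0:c1 + 1]        # tight bounding-box crop
    # nearest-neighbour resample to a common canvas
    h, w = out_hw
    ri = np.arange(h) * crop.shape[0] // h
    ci = np.arange(w) * crop.shape[1] // w
    return crop[np.ix_(ri, ci)]

# toy scan: dark rectangle (the book) on a white page, slightly offset
scan = np.ones((100, 80))
scan[10:90, 5:65] = 0.3
aligned = crop_and_align(scan)
print(aligned.shape)  # (64, 44)
```

With every scan normalized to the same canvas, the "one image overlaid on another" comparison becomes meaningful, though this simple version would not fix a rotated or spine-miswrapped book.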

All that being said, I really like the other suggestions of starting out by presenting categorical features from grader notes or visual inspection and running the learner on that data first. I was thinking last night that you could generate surrogate data for a learner just by taking some of the Overstreet grading rules as categorical features, bootstrapping into the thousands, and trying the learner out on new comics with the features entered manually.
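That surrogate-data idea could look something like this: invent a few Overstreet-style defect severities as features, generate thousands of synthetic books whose grade follows simple hand-written rules plus noise, and check that a basic learner recovers the rules. The feature meanings and penalty weights below are assumptions made up for the demo.

```python
# Surrogate-data sketch: bootstrap synthetic books from hand-written
# grading rules, fit a plain least-squares model, and grade a "new"
# comic with manually entered features.
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# severity 0 (none) .. 3 (heavy) for: ticks, creases, tears
X = rng.integers(0, 4, size=(n, 3)).astype(float)

true_w = np.array([-0.5, -0.9, -1.4])            # hand-written penalties
y = 9.8 + X @ true_w + rng.normal(0.0, 0.2, n)   # rule-based grade + noise

# ordinary least squares with an intercept column
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# grade a new comic: 1 tick, 2 creases, no tears
new_book = np.array([1.0, 1.0, 2.0, 0.0])
pred = float(new_book @ coef)
print(round(pred, 1))  # ≈ 7.5 (9.8 - 0.5 - 2*0.9)
```

Of course a learner trained on rules it was generated from can only rediscover those rules; the value is in debugging the pipeline and the feature encoding before real graded data arrives.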


I'm only now kind of getting a sense of the differences between machine learning and deep learning and understanding why deep learning requires a lot of data to become functional.

I'm also realizing why you folks with more experience are suggesting training a dataset against much less broad criteria before going down this road.

The reason I'm realizing all of this is that I actually went in and made a program (rather than using the model-builder tool I was using before) to utilize TensorFlow, and when I got into the nitty-gritty of it, I saw that the basics of image recognition in the consumer area are all oriented around transfer learning. The existing deep learning model you choose has already been pre-trained against millions of images that it can classify into thousands of different categories, and you build out a model that uses a subset of that data.

This obviously isn't really going to work for scoring a book under the standard comic rating system, when the criteria for classification into the various scoring levels are not even known to the existing model.
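That transfer-learning pattern (frozen pre-trained base, small trainable head) can be illustrated in miniature. Here the "pretrained" extractor is faked as a fixed random projection plus ReLU, and the target is deliberately expressible by those features so the head can fit it exactly; a real setup would use a pre-trained TensorFlow/Keras base model with its layers frozen instead.

```python
# Miniature transfer-learning illustration: a frozen feature extractor
# feeding a small trainable head. Only the head is fit; the base weights
# are never updated.
import numpy as np

rng = np.random.default_rng(1)

W_base = rng.normal(size=(256, 32))   # frozen "pretrained" base weights

def frozen_features(x):
    """Stand-in for a pre-trained backbone: project and apply ReLU."""
    return np.maximum(x @ W_base, 0.0)

X = rng.uniform(size=(200, 256))      # toy "scans" as flat pixel vectors
F = frozen_features(X)

# a target the frozen features can represent (so the demo fits exactly)
h_true = rng.normal(size=32)
y = F @ h_true

# "training" touches only the head: one least-squares fit on top of F
h_fit, *_ = np.linalg.lstsq(F, y, rcond=None)
pred = frozen_features(X) @ h_fit

print(float(np.abs(pred - y).max()))  # ~0: the head recovers the target
```

The catch you identified is exactly this: the head can only recombine whatever the frozen base already represents, so a base trained on everyday objects may not encode spine ticks and creases at all.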

Hmmm, interesting. This does give me some impetus to start breaking down things into those smaller classes.


3 hours ago, thunsicker said:

Have you looked at https://comicbookgradingtool.com/? It's not a very sophisticated tool, but it does go over a lot of the things that you would want to train your model to look for. […]

Yes, I've seen that. It's a good overall tool and a good way to break down the grading system, and your point is well taken. The classifications done on that page, if regarded as accurate, would be what we want to automate. Then the problem just becomes finding enough examples of each to train the machine on.


 

28 minutes ago, onlyweaknesskryptonite said:

CGC sent out an email talking about their new expansion. Interesting in this email is one section I have put an arrow to. [attached screenshot]

Probably using software for grading coins.  Coins will be automated first.

Then currency, stamps and cards.  Comics last.

and then ...

[Skynet image]


3 hours ago, vheflin said:

 

Probably using software for grading coins. […]

[reaction GIF]

1 hour ago, vheflin said:

They'll have roboslabbers before robograders so guess who will be handling your prized collectibles ...

[Bender image]

 

[reaction GIF]


I was actually thinking about this the other day. It wouldn't be definitive, of course; there would need to be adjustments. But if the program says it's a VF, then VF- to VF+ is probably a decent comfort zone, unless it obviously missed something. Dunno about 9.8s, though.
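One way to express that comfort-zone idea as code: report the one-step band of labels around the model's predicted grade rather than a single grade. The numeric values and labels below are the standard scale; the one-step band is just this post's suggestion, not an official rule.

```python
# Comfort-zone toy: map a numeric prediction to the nearest listed grade
# plus the label one step below and one step above it.
GRADES = [(9.2, "NM-"), (9.0, "VF/NM"), (8.5, "VF+"),
          (8.0, "VF"), (7.5, "VF-"), (7.0, "FN/VF")]

def comfort_zone(pred):
    """Return (low, nearest, high) labels around a predicted grade."""
    labels = [label for _, label in GRADES]
    # index of the listed grade closest to the prediction
    i = min(range(len(GRADES)), key=lambda k: abs(GRADES[k][0] - pred))
    low = labels[min(i + 1, len(labels) - 1)]   # list runs high -> low
    high = labels[max(i - 1, 0)]
    return low, labels[i], high

print(comfort_zone(8.0))  # ('VF-', 'VF', 'VF+')
```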


Yeah, the more I think about it, the more impossible this really is without scanners that also map the topography of the cover, so that bends and dips that are not visible in a scan can be accounted for.


4 hours ago, the blob said:

yeah, the more i think about it the more impossible this really is without scanners that map the topography of the cover […]

Eventually somebody will figure out a mechanized system that will be able to assign an accurate grade based on the parameters set. In other words, a graded comic could be graded again on any random day, and the grade would remain the same every time. That will eliminate the human aspect and make all the human-graded slabs obsolete. There's too much money involved in these funny books for a person's opinion to be the final word in valuing the mega keys. Think of all the books being sent in for regrading when it happens.


Most TensorFlow CV models using CNNs effectively blur the images as they pass through the layers, until only the prominent features remain. I've thought about this problem quite a bit, and I don't think deep learning models would really work for grading. I think what other people have mentioned actually makes more sense: look at it from a more simplistic point of view. What I've been thinking about doing is to use something similar to crack detection as a basis for spine folds, and then find the distribution for the various grades. Then do edge detection to calculate how sharp (or not) the corners are, and build another model for determining some of the other things listed on https://www.comicbookgradingtool.com/. Then you can "sum" up the different scores, or build something like a linear regression model to make a final score prediction from the summation of the other scores.
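A rough numpy sketch of one sub-score in that pipeline: a crease/crack measure based on simple finite-difference edges, whose output (together with corner sharpness and the other sub-scores) could then feed the final linear model. The threshold and toy image are illustrative, not calibrated values.

```python
# Crease sub-score sketch: fraction of pixels with a strong local
# intensity jump, via finite-difference gradients.
import numpy as np

def crease_score(gray, thresh=0.2):
    """Fraction of pixels with a strong local intensity jump."""
    gy = np.abs(np.diff(gray, axis=0))          # vertical differences
    gx = np.abs(np.diff(gray, axis=1))          # horizontal differences
    # trim both maps to a common shape before combining
    edges = (gy[:, :-1] > thresh) | (gx[:-1, :] > thresh)
    return float(edges.mean())

# toy cover: flat colour with one light vertical "crease" line
cover = np.full((50, 40), 0.6)
cover[:, 20] = 0.95
print(round(crease_score(cover), 3))  # 0.051
```

A real detector would need smoothing first so cover art edges don't count as creases, which is exactly why per-defect models plus a final regression over their scores is more tractable than end-to-end grading.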


On 3/4/2021 at 3:46 PM, onlyweaknesskryptonite said:

CGC sent out an email talking about their new expansion. […]

"And now a live peek into the CGC mailroom..."

[mailroom GIF]

