Managing Data For Better Fraud Detection
In previous posts, we have discussed data in terms of its value and its characteristics. Those posts touched on data management, but I'd like to focus on it more. Mike Braatz characterized data as the bedrock of "enterprise software in general, and fraud prevention in particular." If data is the bedrock, then data management is the foundation.
Mike also pointed out that every detection technique relies on good data, and that the more complete and contextual the data, the more techniques can be fully supported. The responses to the posts were interesting as well, making points like, "There now needs to be a marriage of out of the box solutions with a flexible framework to allow you to adjust to the environment." That very succinctly recognizes the yin and yang of data management: getting the data is only half the battle; you've also got to be able to access it efficiently, in a way that allows you to do something with it. Another response makes a similar point: "Data does matter indeed. Size matters too! Even if you have the Data, if you are unable to scale dynamically, you're no better off." In this case the author brings up the point of volume and, by my reading, its impact on the ability to use the data.
Many data management solutions I've seen try to solve the volume problem either by being selective about the data they take from the original source, or by modifying that data in some way, such as summarizing it, aggregating it, or otherwise changing it from its original form.
The limitation with this approach is that you lose something. You lose either the data itself because you didn't take it all, or you lose part of the character of the data like relationships between elements and records that only come to light in the original form. So there's the conundrum. (Don't you love conundrums? Life would be pretty dull without them.) How do I get everything I want and still be able to manage and use it efficiently?
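For the technicians in the audience, here is a minimal sketch of what that loss can look like. Everything in it is made up for illustration: the account names, the device IDs, the amounts, and the two helper functions are my assumptions, not anyone's actual product. It just shows that a per-account summary can no longer answer a question the raw records answer easily.

```python
from collections import defaultdict

# Hypothetical raw transactions: (account, device_id, amount).
# All names and values are invented for illustration.
raw = [
    ("acct_A", "device_1", 40.0),
    ("acct_B", "device_1", 60.0),
    ("acct_C", "device_2", 100.0),
]

def summarize(transactions):
    """Aggregate to one total per account, discarding the device field."""
    totals = defaultdict(float)
    for account, _device, amount in transactions:
        totals[account] += amount
    return dict(totals)

def accounts_sharing_device(transactions):
    """Find devices used by more than one account (needs the raw records)."""
    by_device = defaultdict(set)
    for account, device, _amount in transactions:
        by_device[device].add(account)
    return {d: sorted(a) for d, a in by_device.items() if len(a) > 1}

summary = summarize(raw)               # per-account totals only
shared = accounts_sharing_device(raw)  # relationship visible only in raw form
```

Once only `summary` is kept, there is no way to rebuild `shared` from it; the link between the two accounts on the same device is gone.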
We might learn something from the Internet here. On the Internet, data is stored literally all over the world in a variety of formats, and yet we can find it and access it in a matter of seconds. How do they do that? Now, I'm not a technician, and those of you who are will probably roll your eyes at this next statement, but the way I look at it, traditional database technology is all about storing data, and search engine technology is all about retrieving it, wherever and however it is stored. If I had lots and lots of data, I'd put my money on the second horse. Store it simply and economically, and retrieve it quickly and efficiently.
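To give a rough feel for the "store simply, retrieve quickly" idea, here is a toy sketch of an inverted index, the basic structure behind many search engines. It is an illustration under my own assumptions, not a description of how any particular product works: the sample records, the `tokenize`, `build_index`, and `search` names, and the simple word-matching rule are all hypothetical. The records are kept exactly as they arrived, and the index only remembers which records contain which words.

```python
import re
from collections import defaultdict

# Hypothetical records, stored untouched in their original form.
records = [
    "wire transfer from acct_A to acct_Z via device_1",
    "card payment by acct_B at merchant_9 via device_1",
    "wire transfer from acct_C to acct_Y via device_2",
]

def tokenize(text):
    """Split text into lowercase word-like tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())

def build_index(docs):
    """Map each token to the set of record positions that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def search(index, docs, query):
    """Return the original records that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    ids = set.intersection(*(index.get(t, set()) for t in tokens))
    return [docs[i] for i in sorted(ids)]

index = build_index(records)
same_device = search(index, records, "device_1")
wire_on_device = search(index, records, "wire device_1")
```

The point of the design is that nothing is summarized away: a query like `"device_1"` comes back with the full original records, so relationships between them are still there to be found.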
If you've got two cents to spare, please give it to me. One of the objectives of this blog is to stimulate discussion. If I had a nickel for every two cents' worth I've seen here (save our own), I still wouldn't be able to buy my favorite Frappuccino at Starbucks. So if you have a thought (even if you think it's mundane, someone else might have an epiphany from it), please chip in. Sharing is caring.