Big Data – Garbage in, garbage out?

Change of plan for this post… I visited the dentist recently, and before the consultation I was handed an iPad with a form to complete. I was sure I had completed this form on my last visit – and when I checked with the receptionist, she confirmed it has to be completed every six months. So I had completed it before. It was a long form asking all sorts of details about medical history, medicines being taken and so on. It included questions about lifestyle – how much exercise you get, whether you smoke, how much alcohol you drink. It all seemed rather over the top to be repeating every six months, and such an inefficient process, prone to error: every patient completing all these detailed questions (often in a rush), with no way to check their previous answers. Wouldn’t it be nice if the form were pre-filled with my previous answers so I could just make any adjustments? All a little frustrating really. So I asked the receptionist why it was all needed.

“The government needs it,” was the reply. Really? What on earth do they do with it all, I wondered. I have to admit, that answer prompted a little experiment: I tried to see whether the form would submit without me entering anything. It didn’t – it told me I had to sign the form first. So I signed it, and sure enough it was accepted. I handed the iPad back to the receptionist, who thanked me for being so quick. Off I went to my appointment and all was fine. And I felt as though I had struck a very small blow for freedom.

I wonder what does happen to all that data. Does it really go to “the government”? What would they do with it? Is it a case of gathering Big Data that can then be mined for trends – how various factors affect dental health, maybe? Well, one thing’s for sure: I wouldn’t trust the conclusions, given how easy the system seems to be to dupe. What guarantee is there of the accuracy of any of the data? It seems to me a case of garbage in, garbage out.
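To make the point concrete, here is a toy simulation (with entirely invented figures – the true rate, the junk fraction and the defaults are all assumptions for illustration) of how blank or rushed form submissions skew whatever gets mined from the data:

```python
# Toy illustration of "garbage in, garbage out". Suppose the true smoking
# rate among patients is 20%, but a fraction of forms are junk - signed
# and submitted with the default "non-smoker" answer left untouched.
# All figures here are invented for illustration.
import random

random.seed(1)
true_rate = 0.20       # assumed true smoking rate in the patient population
junk_fraction = 0.30   # assumed fraction of forms completed without real answers

responses = []
for _ in range(10_000):
    if random.random() < junk_fraction:
        responses.append(0)  # junk form: defaults to "no"
    else:
        responses.append(1 if random.random() < true_rate else 0)

observed = sum(responses) / len(responses)
print(f"True rate: {true_rate:.0%}, rate in the data: {observed:.1%}")
# -> roughly 14%: any "trend" mined from this data understates smoking
#    by about a third, and nobody downstream can tell.
```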

As we are all wowed by what Big Data can do, and by the incredible neural networks and algorithms teams can develop to help us (see previous blog), we do need to think about the source of the Big Data. Where has it come from? Could it be biased (almost certainly)? And in what way? How can we guard against the impact of that bias? There’s been a lot in the news recently about the dangers of bias – for example in Time and the Guardian. If we’re not careful, we build the bias into the algorithms and simply perpetuate the discrimination we already have. Our best defence is scepticism – just as it is in root cause analysis when an expert is quoted as evidence. As Edward Hodnett says: “Be sceptical of assertions of fact that start, ‘J. Irving Allerdyce, the tax expert, says…’ There are at least ten ways in which these facts may not be valid. (1) Allerdyce may not have made the statement at all. (2) He may have made an error. (3) He may be misquoted. (4) He may have been quoted only in part….”

Being sceptical and asking questions can help us avoid erroneous conclusions. Ask questions like: “How do you know that?”, “Do we have evidence for that?” and “Could there be bias here?”

Big Data has huge potential. But let’s not be so wowed by it that we stop questioning. Be sceptical. Remember, it could be another case of garbage in, garbage out.

Image: Pixabay

Text: © 2017 Dorricott MPI Ltd. All rights reserved.

Where’s My Luggage?

On a recent flight, I had a transfer in Dublin. My arriving flight was delayed because there weren’t enough available stands at the airport. I made it to my connecting flight, but evidently my hold luggage did not. Have you ever been there? Standing by the baggage reclaim, watching the bags come out. Slowly they are collected by their owners, who disappear off, until you are left watching the one or two unclaimed bags go round and round – and yours is not there. Not great.

The process of finding my luggage and delivering it home the next day was actually all pretty efficient. I filled in a form, my details were entered in the system and then I got regular updates via email and text on what was happening. The delivery company called me 30 minutes before arriving at my house to check I was in. But it was still frustrating not having my luggage for 24 hours. It got me thinking…

How often does this happen? Apparently, on average, less than 1% of bags are lost – though given the number of bags flown, that’s still a lot, and it explains why the process of locating and delivering them is so well refined, with specific systems to track and communicate. But what is the risk on specific journeys and transfers? When I booked the flight, the airline had recommended the relatively short transfer time in Dublin. My guess is that luggage missing the connecting flight on that schedule is not that unusual – a delay of 30 minutes or more and it seems your luggage is likely to miss the transfer. And a 30-minute delay is not unusual, as we all know.

This is a process failure, and it has a direct cost: administration (forms, personnel entering data into a system, a help line, labelling), IT (a specific system with customer access) and transport (from the airport to my home). I would guess at US$200 minimum. That must easily wipe out the profit on the sale of my ticket (which cost US$600). So this gives some idea of the frequency – it cannot be so high as to negate all the profit from selling tickets. It must come down to a cost-benefit analysis by the airline. Perhaps luggage misses this particular connecting flight 5% of the time and the airline accepts the direct cost, because the benefit is that customers prefer the shorter transfer time and the overall travel time is less. So far so good.
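As a back-of-envelope check on that reasoning, here is the airline’s side of the calculation using my guessed figures above – the recovery cost and the 5% miss rate are assumptions, not real airline data:

```python
# Back-of-envelope version of the airline's cost-benefit.
# All figures are the post's guesses, not real airline data.

ticket_price = 600.0    # US$ - what I paid for the ticket
recovery_cost = 200.0   # US$ - my guess at the direct cost of returning one delayed bag
miss_rate = 0.05        # assumed: 5% of bags miss this particular connection

expected_cost = miss_rate * recovery_cost
print(f"Expected delayed-luggage cost per passenger: US${expected_cost:.2f}")
# -> US$10.00 per passenger: small against a US$600 fare, so the airline
#    can absorb it and still offer the shorter, more attractive transfer.
```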

But what about the cost of the 24 hours I spent without my luggage? That’s not factored into the cost-benefit, I’m sure, because it’s not a cost the airline can quantify. Is my frustration enough to make me decide not to fly with that airline again? I heard recently of someone whose holiday was completely messed up by delayed luggage. They had travelled to a country planning to hire a car and drive to a neighbouring country the next day. But the airline said it could only deliver the delayed luggage within the country of arrival – and it would take 48 hours. The direct cost to the airline was fairly small, but the impact on the customer was significant.

So how about this for an idea. We’re in the information age, and the data on delayed luggage must already be captured. When I go to book a flight with a short transfer time in future, I’d like to know the likelihood (based on past data) of my luggage not making the transfer. Instead of the airline being the only one to carry out the cost-benefit analysis, I want in on the decision too – based on data. If the risk looks small, I might decide to take it. As we all have our own tolerance for risk, we might make different decisions. But at least that way we are more in control, rather than leaving it all to the airline. That would be empowerment.
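As a sketch of the idea – with invented delay figures standing in for the airline’s real records – the calculation behind such a display could be as simple as this:

```python
# Hypothetical sketch of a booking-time risk display. The delay figures
# are invented for illustration; a real system would draw on the airline's
# historical records for this route and transfer.

historical_delays_min = [0, 5, 0, 45, 10, 0, 20, 60, 0, 15, 35, 0]  # past arrival delays
min_connection_min = 30  # assumed: bags miss the transfer if arrival is 30+ minutes late

missed = sum(1 for d in historical_delays_min if d >= min_connection_min)
risk = missed / len(historical_delays_min)
print(f"Estimated chance your luggage misses the connection: {risk:.0%}")
# -> 25% on this invented data - a figure a booking site could show next
#    to each itinerary, letting the passenger make the call.
```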

We can’t ensure everything always goes right. But we can use past performance to estimate risk and take our own decisions accordingly.


Photo: Kenneth Lu (license)

Text: © 2017 Dorricott MPI Ltd. All rights reserved.

Artificial Intelligence – The End of the Human Race?

Yesterday, I attended a New Scientist Instant Expert Event on Artificial Intelligence (AI) in London. It was packed, and it had some fascinating speakers. It started with the stir caused in the press by Stephen Hawking’s claim in 2014 that “The development of full artificial intelligence could spell the end of the human race.” Could things really be that bad?

This is a field moving so fast that society is not keeping up with the ethical and privacy questions it raises. Lilian Edwards (University of Strathclyde) gave a fascinating talk on the legal issues around AI. She said that although many questions are raised, our laws are mostly good enough at the moment. A robot or AI is not a legal personality like a human or a limited company; it should be thought of as a tool – and we have had industrial robots around for years that fit without problem into the existing framework for Health and Safety, for example. One question that is very relevant today: if an autonomous vehicle injures someone, who is to blame? The driver, the manufacturer, the creator of the algorithm, the provider of the training set, a third-party hacker? But we actually have a similar situation already – when there is an accident, different parties can each take part of the blame, whether driver, manufacturer, mechanic or road maintenance. So this is not a new type of issue in law. We solve it currently with compulsory insurance, and the insurance companies fight it out. Of course, that doesn’t fix the injury.

Another interesting area explored was privacy when AI is applied to Big Data. There was the example of Target, where an algorithm worked out that a girl was pregnant before her father knew – and the household then received lots of vouchers for baby products. Or smart electricity meters that can detect power being used upstairs in a house whose occupant is claiming benefits for being disabled and unable to go upstairs. We should all be asking where our data is going and how it is going to be used.

Irina Higgins from DeepMind (Google) talked about the astonishing achievement of AlphaGo beating the world champion, Lee Sedol, at the game of Go. She and Simon Lucas from the University of Essex discussed why AI researchers focus so much on games. Games are designed to be interesting to humans, so presumably they encapsulate something of what it is to be human. They can also be set up and played many times without any safety concerns (which you might have with a robot flailing around in a lab). There was a great example of AI software tackling games it had not seen before – and, after playing them many times, working out strategies that made it ‘super-human’, including strategies no human had come up with to win the game. Irina also shared how DeepMind’s AI had been used to reduce the energy consumption of Google’s cooling systems by 40%. Human intelligence is difficult to define, but one of its attributes is generality – the ability to tackle not just one task well but completely unrelated ones too. AI and robots can be super-human in specific areas, but it will be a long time before they have this general intelligence. What AI can do is assist humans to be faster and smarter in a world with too much information and great system complexity. We should see it as a tool.

Kerstin Dautenhahn from the University of Hertfordshire talked about the use of robots and AI to help people – whether they are infirm and living at home, or living with autism. With her background in biology, Kerstin brought an interesting slant to the discussion.

The final session was a debate on questions submitted by the audience, and it was a lively affair. Among the big questions: can a robot be truly conscious? Does so much funding of robotics by the military taint the field? Should there be a tax on companies that rely on AI for their profits? Should sex robots be allowed?

The final question of the debate was very revealing. The five panellists were asked whether they believed we would see AI equivalent to human intelligence in the next 70 years. Three gave an unequivocal no, one a qualified no, and one an unequivocal yes. So whilst AI is a very fast-moving field, it seems that on balance the experts (on this panel at least) think human-level intelligence is a long, long way off. My takeaway is that AI offers huge promise for mankind – we should not view it as a coming apocalypse that will end our race. But we do need more debates like this. We need to discuss as a society the big questions and the ethics, so that we can minimise the unintended consequences of this fantastic opportunity.


Text © 2017 Dorricott MPI Ltd. All rights reserved.