#Bitcoin is the Sewer Rat of Currencies
Things auto-tagged ‘Jungle Gym’ on Flickr
Flickr has introduced auto-tagging, aided by machine learning (I checked that it really uses ML and found this Yahoo machine learning presentation). The user response has been quite negative so far: this Flickr forum post has a lot of angry pro users who have had to correct thousands of photographs with inexact tags. Flickr openly say they want people to correct their tags because that will further help train their ML algorithms.
Alex Hern of the Guardian wrote about some contentious cases, such as people being auto-tagged ‘ape’ and concentration camps being tagged ‘sport’ and ‘jungle gym’. In isolation these cases seem really outrageous, so I did a search for ‘jungle gym’ and found many false positives, which points to a much more systemic problem. It seems Flickr’s strategy is to auto-tag as much as possible, forcing their users, who are often not bothered about tags, to respond by curating a better set of tags for each image. So the bigger strategy seems to be to pit machine learning against human labour in an attempt to make their algos smarter and their image service perfectly tagged.
https://github.com/yahoo/samoa
Machine learning and data mining are well-established techniques in the world of IT and especially among web companies and startups. Spam detection, personalization and recommendations are just a few of the applications made possible by mining the huge quantity of data available nowadays. However, “big data” is not only about Volume, but also about Velocity (and Variety: together, the three Vs of big data).
The usual pipeline for modeling data (what “data scientists” do) involves taking a sample from production data, cleaning and preprocessing it to make it usable, training a model for the task at hand and finally deploying it to production. The final output of this process is a pipeline that needs to run periodically (and be maintained) in order to keep the model up to date. Hadoop and its ecosystem (e.g., Mahout) have proven to be an extremely successful platform to support this process at web scale.
However, no solution is perfect and big data is “data whose characteristics forces us to look beyond the traditional methods that are prevalent at the time”. The current challenge is to move towards analyzing data as soon as it arrives into the system, nearly in real-time.
For example, models for mail spam detection get outdated with time and need to be retrained with new data. New data (i.e., spam reports) comes in continuously and the model starts being outdated the moment it is deployed: all the new data is sitting without creating any value until the next model update. On the contrary, incorporating new data as soon as it arrives is what the “Velocity” in big data is about. In this case, Hadoop is not the ideal tool to cope with streams of fast changing data.
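The contrast between periodic retraining and incorporating data on arrival can be sketched in a few lines. The example below is a minimal illustration, not how any production spam filter works: `OnlineSpamFilter` is a hypothetical class, and the perceptron update rule is swapped in only because it is one of the simplest learners that can absorb a single labeled report at a time.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical online spam filter: one perceptron update per incoming report. */
public class OnlineSpamFilter {
    private final Map<String, Double> weights = new HashMap<>();
    private double bias = 0.0;

    /** Returns true if the message (a bag of tokens) currently scores as spam. */
    public boolean predict(String[] tokens) {
        double score = bias;
        for (String token : tokens) {
            score += weights.getOrDefault(token, 0.0);
        }
        return score > 0.0;
    }

    /** Incorporates one labeled report immediately; there is no batch retraining step. */
    public void update(String[] tokens, boolean isSpam) {
        if (predict(tokens) == isSpam) {
            return; // perceptron rule: only mistakes change the model
        }
        double delta = isSpam ? 1.0 : -1.0;
        bias += delta;
        for (String token : tokens) {
            weights.merge(token, delta, Double::sum);
        }
    }

    public static void main(String[] args) {
        OnlineSpamFilter filter = new OnlineSpamFilter();
        String[] spam = {"cheap", "pills"};
        String[] ham = {"meeting", "tomorrow"};
        filter.update(spam, true);  // the model changes as soon as the report arrives
        filter.update(ham, false);
        System.out.println("spam message flagged: " + filter.predict(spam));
        System.out.println("ham message flagged: " + filter.predict(ham));
    }
}
```

Every call to `update` leaves the model ready to score the next message, so no report sits idle waiting for a scheduled retraining job.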
Distributed stream processing engines are emerging as the platform of choice to handle this use case. Examples of these platforms are Storm, S4, and recently Samza. These platforms join the scalability of distributed processing with the fast response of stream processing. Yahoo has already adopted Storm as a key technology for low-latency big data processing.
Alas, currently there is no common solution for mining big data streams, that is, for doing machine learning on streams in a distributed environment.
SAMOA (Scalable Advanced Massive Online Analysis) is a framework for mining big data streams. Like most of the big data ecosystem, it is written in Java. It features a pluggable architecture that allows it to run on several distributed stream processing engines such as Storm and S4. SAMOA includes distributed algorithms for the most common machine learning tasks such as classification and clustering. For a simple analogy, you can think of SAMOA as Mahout for streaming.
SAMOA is both a platform and a library. As a platform, it allows the algorithm developer to abstract from the underlying execution engine, and therefore reuse their code to run on different engines. It also makes it easy to write plug-in modules that port SAMOA to different execution engines.
As a library, SAMOA contains state-of-the-art implementations of algorithms for distributed machine learning on streams. The first alpha release allows classification and clustering.
For classification, we implemented a Vertical Hoeffding Tree (VHT), a distributed streaming version of decision trees tailored for sparse data (e.g., text). For clustering, we included a distributed algorithm based on CluStream. The library also includes meta-algorithms such as bagging.
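For readers unfamiliar with Hoeffding trees, the core idea is a statistical test that decides when a leaf has seen enough instances to split. The sketch below shows the standard Hoeffding bound, epsilon = sqrt(R^2 ln(1/delta) / (2n)), as used in the general Hoeffding tree literature. It is an illustration only: the class and method names are hypothetical, and nothing here is taken from the VHT implementation in SAMOA.

```java
/** Illustrative Hoeffding bound split test; names are hypothetical, not SAMOA's API. */
public class HoeffdingBound {

    /**
     * With probability 1 - delta, the true mean of a variable with the given range
     * lies within epsilon of the mean observed over n instances.
     */
    public static double epsilon(double range, double delta, long n) {
        return Math.sqrt(range * range * Math.log(1.0 / delta) / (2.0 * n));
    }

    /** Split when the best attribute beats the runner-up by more than epsilon. */
    public static boolean shouldSplit(double bestGain, double secondBestGain,
                                      double range, double delta, long n) {
        return bestGain - secondBestGain > epsilon(range, delta, n);
    }

    public static void main(String[] args) {
        double delta = 1e-7; // assumed confidence parameter, chosen for illustration
        for (long n : new long[] {100, 1000, 10000}) {
            System.out.printf("n=%d epsilon=%.4f split=%b%n",
                    n, epsilon(1.0, delta, n), shouldSplit(0.30, 0.10, 1.0, delta, n));
        }
    }
}
```

Because epsilon shrinks as n grows, a leaf that cannot yet separate its two best attributes simply waits for more instances from the stream.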
An algorithm in SAMOA is represented by a series of nodes communicating via messages along streams that connect pairs of nodes (a graph). Borrowing the terminology from Storm, this is called a Topology. Each node in the Topology is a Processor that sends messages to a Stream. The user code that implements the algorithm resides inside a Processor. Figure 3 shows an example of a Processor joining two streams from two source Processors. Here is a code snippet to build such a topology in SAMOA.
TopologyBuilder builder;
Processor sourceOne = new SourceProcessor();
builder.addProcessor(sourceOne);
Stream streamOne = builder.createStream(sourceOne);
Processor sourceTwo = new SourceProcessor();
builder.addProcessor(sourceTwo);
Stream streamTwo = builder.createStream(sourceTwo);
Processor join = new JoinProcessor();
builder.addProcessor(join).connectInputShuffle(streamOne).connectInputKey(streamTwo);
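The snippet connects one input with a shuffle grouping and the other with a key grouping. Assuming these behave like the groupings of the same names in Storm (an assumption, since the text does not define them), the difference can be sketched in plain Java. `GroupingSketch` is a hypothetical class and does not use or reproduce the SAMOA API.

```java
/** Toy routing sketch: which parallel instance of a Processor receives a message. */
public class GroupingSketch {
    private final int parallelism;
    private int next = 0;

    public GroupingSketch(int parallelism) {
        this.parallelism = parallelism;
    }

    /** Shuffle grouping, sketched here as round-robin: load is spread evenly. */
    public int shuffle() {
        int target = next;
        next = (next + 1) % parallelism;
        return target;
    }

    /** Key grouping: messages with the same key always reach the same instance. */
    public int byKey(String key) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    public static void main(String[] args) {
        GroupingSketch router = new GroupingSketch(3);
        for (String key : new String[] {"user-a", "user-b", "user-a"}) {
            System.out.println(key + " shuffle->" + router.shuffle()
                    + " key->" + router.byKey(key));
        }
    }
}
```

Key grouping matters for a join because both sides of a match must meet at the same instance, while shuffle grouping is enough when any instance can handle any message.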
1. Download SAMOA
git clone git@github.com:yahoo/samoa.git
cd samoa
mvn -Pstorm package
2. Download the Forest CoverType dataset.
wget "http://downloads.sourceforge.net/project/moa-datastream/Datasets/Classification/covtypeNorm.arff.zip"
unzip covtypeNorm.arff.zip
Forest CoverType contains the forest cover type for 30 x 30 meter cells obtained from US Forest Service (USFS) Region 2 Resource Information System (RIS) data. It contains 581,012 instances and 54 attributes, and it has been used in several papers on data stream classification.
3. Download a simple logging library.
wget "http://repo1.maven.org/maven2/org/slf4j/slf4j-simple/1.7.2/slf4j-simple-1.7.2.jar"
4. Run an Example. Classifying the CoverType dataset with the VerticalHoeffdingTree in local mode.
java -cp slf4j-simple-1.7.2.jar:target/SAMOA-Storm-0.0.1.jar com.yahoo.labs.samoa.DoTask "PrequentialEvaluation -l classifiers.trees.VerticalHoeffdingTree -s (ArffFileStream -f covtypeNorm.arff) -f 100000"
The output will be a sequence of the evaluation metrics for accuracy, taken every 100,000 instances.
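Prequential evaluation is usually described as "test, then train": each instance is first used to score the current model and only afterwards to update it. The sketch below illustrates that loop with a trivial majority-class learner. The class names are hypothetical, and reporting cumulative accuracy is an assumption made for the example; it is not a description of how SAMOA's PrequentialEvaluation task computes its metrics.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative test-then-train loop; not SAMOA's PrequentialEvaluation task. */
public class PrequentialSketch {

    /** Trivial learner: predicts the most frequent label seen so far. */
    static class MajorityClass {
        private final int[] counts;

        MajorityClass(int numClasses) {
            counts = new int[numClasses];
        }

        int predict() {
            int best = 0;
            for (int c = 1; c < counts.length; c++) {
                if (counts[c] > counts[best]) {
                    best = c;
                }
            }
            return best;
        }

        void train(int label) {
            counts[label]++;
        }
    }

    /** Returns cumulative accuracy sampled every `frequency` instances. */
    public static List<Double> evaluate(int[] labels, int numClasses, int frequency) {
        MajorityClass learner = new MajorityClass(numClasses);
        List<Double> samples = new ArrayList<>();
        int correct = 0;
        for (int i = 0; i < labels.length; i++) {
            if (learner.predict() == labels[i]) {
                correct++;                // test on the instance first
            }
            learner.train(labels[i]);     // then use the same instance to train
            if ((i + 1) % frequency == 0) {
                samples.add((double) correct / (i + 1));
            }
        }
        return samples;
    }

    public static void main(String[] args) {
        int[] labels = {1, 1, 1, 1};
        System.out.println(evaluate(labels, 2, 2)); // prints [0.5, 0.75]
    }
}
```

Since every instance is scored before the model has seen it, no separate held-out test set is needed, which suits an unbounded stream.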
To run the example on Storm, please refer to the instructions on the wiki.
For more information about SAMOA, see the README and the wiki on github, or post a question on the mailing list.
SAMOA is licensed under an Apache Software License v2.0. You are welcome to contribute to the project! SAMOA accepts contributions under an Apache style contributor license agreement.
Good luck! We hope you find SAMOA useful. We will continue developing the framework by adding new algorithms and platforms.
Gianmarco De Francisci Morales (gdfm@yahoo-inc.com) and Albert Bifet (abifet@yahoo.com) @ Yahoo Labs Barcelona
“I cannot stretch my imagination so far, but I do firmly believe that it is practicable to disturb by means of powerful machines the electrostatic condition of the earth and thus transmit intelligible signals and perhaps power. In fact, what is there against the carrying out of such a scheme? We now know that electric vibration may be transmitted through a single conductor. Why then not try to avail ourselves of the earth for this purpose? We need not be frightened by the idea of distance. To the weary wanderer counting the mile-posts the earth may appear very large, but to that happiest of all men, the astronomer, who gazes at the heavens and by their standard judges the magnitude of our globe, it appears very small. And so I think it must seem to the electrician, for when he considers the speed with which an electric disturbance is propagated through the earth all his ideas of distance must completely vanish.”
“On Light And Other High Frequency Phenomena.” Lecture delivered before the Franklin Institute, Philadelphia, February 1893, and before the National Electric Light Association, St. Louis, March 1893.
ARKit proof-of-concept demo from Trixi Studios applies an Augmented Reality portal with a ‘Take On Me’ music video drawing filter effect through an iOS device camera:
Class Zero
Project from Peder Norrby is an iPhone X visual toy that uses TrueDepth face tracking to produce a trompe-l'œil effect of depth from the position of your head:
Explainer video - enable sound! The app, called #TheParallaxView, is in review on @AppStore #iPhoneX #ARKit #FaceTracking #madewithunity pic.twitter.com/6P8ofGZqP4
— ΛLGΘMΨSΓIC (@algomystic)
February 28, 2018
Yes it’s ARKit face tracking and #madewithunity … basically non-symmetric camera frustum / off-axis projection.
The app is currently in review, but Peder plans to release the code on GitHub in the future for developers to experiment with.
You can follow progress on Peder’s Twitter account.
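The "non-symmetric camera frustum / off-axis projection" mentioned in the tweet can be illustrated with a little geometry. The sketch below assumes a screen centred at the origin in the z = 0 plane and a viewer's eye in front of it at positive z, then computes the frustum extents at the near plane. It is a generic illustration of off-axis projection, not code from the app, and `OffAxisFrustum` and its parameters are hypothetical.

```java
/** Illustrative off-axis frustum computation; not code from TheParallaxView. */
public class OffAxisFrustum {

    /**
     * Returns {left, right, bottom, top} at the near plane for an eye at
     * (eyeX, eyeY, eyeZ) looking at a screen centred on the origin in the z = 0 plane.
     */
    public static double[] extents(double halfWidth, double halfHeight,
                                   double eyeX, double eyeY, double eyeZ, double near) {
        if (eyeZ <= 0) {
            throw new IllegalArgumentException("eye must be in front of the screen");
        }
        double scale = near / eyeZ; // similar triangles: screen edge projected to near plane
        return new double[] {
            (-halfWidth - eyeX) * scale,
            (halfWidth - eyeX) * scale,
            (-halfHeight - eyeY) * scale,
            (halfHeight - eyeY) * scale
        };
    }

    public static void main(String[] args) {
        double[] centred = extents(1.0, 1.0, 0.0, 0.0, 2.0, 1.0);
        double[] shifted = extents(1.0, 1.0, 0.5, 0.0, 2.0, 1.0);
        System.out.println(java.util.Arrays.toString(centred));
        System.out.println(java.util.Arrays.toString(shifted));
    }
}
```

When the eye is centred the frustum is symmetric. As the head moves sideways the left and right extents stop mirroring each other, which is what makes the screen read as a window onto a scene behind it.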
Project from Universal Everything is a series of films exploring human-machine collaboration, here presenting performative dance with human and abstracted forms:
Hype Cycle is a series of futurist films exploring human-machine collaboration through performance and emerging technologies.
Machine Learning is the second set of films in the Hype Cycle series. It builds on the studio’s past experiments with motion studies, and asks: when will machines achieve human agility?
Set in a spacious, well-worn dance studio, a dancer teaches a series of robots how to move. As the robots’ abilities develop from shaky mimicry to composed mastery, a physical dialogue emerges between man and machine – mimicking, balancing, challenging, competing, outmanoeuvring.
Can the robot keep up with the dancer? At what point does the robot outperform the dancer? Would a robot ever perform just for pleasure? Does giving a machine a name give it a soul?
These human-machine interactions from Universal Everything are inspired by the Hype Cycle trend graphs produced by Gartner Research, a valiant attempt to predict future expectations and disillusionments as new technologies come to market.
Finding your friends at a festival | by David Urbina for @neonapp. Get notified when the app is released. Music: Seven Lions x Illenium x Said The Sky.
the age of the really useful apps is starting
— Nil (@niluspc)
August 16, 2017
This is rad. Hope it shows up at some festivals soon https://t.co/c9a1W7auEe
— Goldroom (@goldroom)
August 16, 2017
One of the best uses for AR I’ve seen. https://t.co/kxGAUzVyEf
— Alexander Danling (@baobame)
August 15, 2017
Seeing more practical and indispensable use-cases for AR than I have for new apps in quite a while. pic.twitter.com/zwHEGkYZrK via @ARKitweekly
— Scott Belsky (@scottbelsky)
August 15, 2017
Reasons like this are why I think AR >> VR https://t.co/7rt5pRT3o6
— Mohammad Al Azzouni (@mazzouni)
August 17, 2017
I need this in my life! https://t.co/yGbGrWYLBD
— Stefan Goodchild ⚛ (@stefangoodchild)
August 15, 2017
ARKit really will bring a new wave of useful functionality to the phone. https://t.co/H6TT1SlFkj
— CM Harrington (@octothorpe)
August 15, 2017
I love this. Good example of AR solving a REAL problem 👏 https://t.co/6wx3RSwSag
— Sam Clarke (@sclarke111)
August 17, 2017
ARKit is going to empower so many awesome apps when iOS 11 ships. https://t.co/MUaTqbDUb1
— Matt Sayward (@mattsayward)
August 15, 2017
By far the most functional implementation of AR I’ve ever seen. https://t.co/cWC3ymxq9z
— Thomas Claessens (@DeClaessens)
August 16, 2017
This looks mighty useful https://t.co/vh3vTjuVLO
— Max Böck (@mxbck)
August 15, 2017
Impressive (and actually useful) https://t.co/VHdlXzAdGY
— Dominik Schmidt (@sluderndotcom)
August 16, 2017
This is such a good idea! https://t.co/X7xhgB7xeT
— Donna Lowe (@reloweeda)
August 15, 2017
👍🏽 would be super handy https://t.co/9Tk2Q16qnE
— Simon (@liquidmedia2013)
August 15, 2017
Genuinely useful AR coming to a field near you. https://t.co/4M8b92UJLk
— Cennydd (@Cennydd)
August 16, 2017
Find you festival friends with AR - Definitely the coolest implementation I’ve seen so far. App revolution 2.0 on its way. https://t.co/dKDkPRbMw1
— Tom Austin (@tomhaustin)
August 15, 2017
I can’t wait to try his app 😱 https://t.co/YHkZ9F91Zn
— Alexandre Mouriec (@mrcalexandre)
August 15, 2017
This is magical. ARKit demos by the app developers have been 👌🏻. Can’t wait to play with these apps. https://t.co/0RmQ7kkCiE
— KietChieng (@KietChieng)
August 16, 2017
GIMME THAT GIMME THAT RIGHT NOW https://t.co/Hg6fO6GWOq
— Valentin (@valdecarpentrie)
August 15, 2017
This is something I need https://t.co/iVjEkRxCaJ
— Andrew Rodebaugh (@andrewrodebaugh)
August 15, 2017
Less lost folks wandering the festival grounds aimlessly… Love some functional AR! https://t.co/deXJ8nMFQu
— Kent Weber (@WeberKent)
August 15, 2017
Again. This will be a game changer https://t.co/YiN2LQvmU5
— Jens@Gamescom (@JensHerforth)
August 15, 2017
This is a pretty cool use of GPS+ARKit, awesome demo use case! 🛳-it! #ARKit #MapKit #iOS11 https://t.co/LmMjPfo7KW
— Benjamin Hendricks (@benjhendricks)
August 15, 2017
The practical uses of #AR are incredible… this kind of thing will be the norm in the next few years & I can’t wait to test it. #Innovation https://t.co/XdkAdEG11G
— Josh Worth (@JoshWorthh)
August 15, 2017
OMG best use of the #ARKit. At festivals, i spend half my time looking for my friends in the crowd… https://t.co/YPb0AfAFjn
— Julie Tonna (@julie_tonna)
August 15, 2017
Awesome! This would also be cool for something like @ingress / @PokemonGoApp. Ps: love that new iPhone design 😉
— Marcel (@marceldk)
August 15, 2017
OMG !!!!!!!! #Devslopes https://t.co/MzN5RKn1DI
— leonyuon (@leonyuonl)
August 15, 2017
This is amazing! https://t.co/ZrpQBEgaU3
— Shane Griffiths (@shanegriffiths)
August 15, 2017
i just cant stop getting excited by these ARKit demos 🌟 https://t.co/IXAM6N0VBf
— nikhil srinivasan 👾 (@nvs)
August 15, 2017
Just think how much more enjoyable festivals would have been if you weren’t constantly losing/looking for everyone. https://t.co/uzxNJMqI4c
— Neil Cooper (@ncooperdesign)
August 15, 2017
Future killer Jazz Fest/Mardi Gras app for iPhone. (and really every other large gathering where you wanna find your friends) https://t.co/RXkVrLOuQB
— Stephen Sullivan (@swgs)
August 15, 2017
💯 arkit is legit 💯 https://t.co/8h3gWtdMtE
— Sean PJPGR Doran (@spjpgrd)
August 15, 2017
Another cool use of #ARKit https://t.co/0QUrN4BgJF
— Matt Zarandi ⚡️ (@MattZarandi)
August 15, 2017
Now this is something genuinely useful for AR https://t.co/7CvykUc2SQ
— Joel (@joevo2)
August 16, 2017
#musthave https://t.co/4KIhkWghKD
— Gee 🔥 (@Georg_Schmo)
August 15, 2017
This would have come in so handy on many occasions. https://t.co/2jI7uQn1Lf
— Steven Lin (@Stevenchlin)
August 15, 2017
Another great usecase! https://t.co/T5ggr8Qyez
— Schlabbeschambes (@DerHurly)
August 15, 2017
AR is gonna be so cool https://t.co/qmlxshUk03
— Beans (@beano629)
August 15, 2017
This is pretty brilliant! https://t.co/TevMmjBLKE
— Vlad Vukicevic (@vvuk)
August 15, 2017
A 🔥use case here ⬇️ just amazing #ARKit https://t.co/elPyWbW4iO
— Glenville Morris (@glenvillemorris)
August 15, 2017
Now thats a smart techcombi https://t.co/wH8ECU7VxO
— thefirstfloor (@jeroenduhmooij)
August 15, 2017
We gonna be livin’ in 2025 real soon. https://t.co/RgXCAjdb2t
— David Bird (@David_Burns_Red)
August 15, 2017
here’s another super rad use case that would also work for finding your Lyft / Uber driver https://t.co/JVm3oqGrW9
— TIFFANY ZHONG (@TZhongg)
August 15, 2017
Great usage of ARKit! https://t.co/jJ1VDOX4zb
— Elliot Turner (@eturner303)
August 15, 2017
#ARKit (demo) with a practical concept to navigate space and impact social engagement #AR #interactivetech #socialAR https://t.co/2352xf9haz
— Melody Koebler (@melabyyte)
August 15, 2017
Well, that’s bloody awesome https://t.co/XvCLwNsqJB
— Neil Kleiner (@nkleiner)
August 15, 2017
Handy real-world application for #AR. Beats “we’re to the left of the stage” https://t.co/zoMbK4dUSm
— Jon Williams (@yesthatjon)
August 15, 2017
Now THIS is awesome › https://t.co/xP6LamQuua #ARKit
— Jermaine (@dviate)
August 15, 2017
Neat idea. Is it just me or does it feel like it wants a giant column of light like in an MMO or something? https://t.co/SM2dKw80wT
— Gabe Weiss (@GabeWeiss_)
August 15, 2017
Yes and yes! And not just for finding people you already know, opt-in real-time people discovery in the offline world has massive potential https://t.co/zsAQy0q55z
— Shuvi👩🏻💻 (@shuvi)
August 15, 2017
Find my friends on a whole new level #ARKit https://t.co/l53rkXr4PS
— Spencer Bratman (@SpencerBratman)
August 15, 2017
Eyyy this is what I’m talkin about—next to disrupt social media? https://t.co/eN2BSvYXNh
— Kenneth Ng (@KennethLNg)
August 16, 2017
Well this is awesomely handy. https://t.co/KmU4FJvErV
— Dan Z (@danactual)
August 16, 2017
Stop this is amazing!! https://t.co/ZcTy1iAlVt
— Daniel Feodoroff (@mrdanielfeo)
August 16, 2017
Clever! https://t.co/SnjqQD8gL9
— geoff brown (@cgeoffreybrown)
August 16, 2017
Looking forward to way more of this … https://t.co/Qdx0fMK3sh
— Neil Voss (@neilvoss)
August 16, 2017
Just watch this video, one of the best uses of AR I’ve seen https://t.co/OZFjwiIKLP
— Ben King (@kngbn79)
August 16, 2017
AR tinder is gonna be wicked
— Utkarsh Gupta (@u7karsh)
August 16, 2017
Now this is cool! #arkit #ar #AugmentedReality https://t.co/s7E4jkqkpN
— Jen Abel 💬💫 (@jjen_abel)
August 17, 2017
i’ve been waiting for an app like this for a while https://t.co/0uaEwKgtm9
— ✨🌵🦊 🌴✨ (@ryanrogalski)
August 17, 2017