Tobias Klein (Tilburg University) - How important are user-generated data for search result quality?

Published: Apr 10, 2024 Duration: 00:47:25 Category: Education

[Music] ...for search result quality. It is joint work with Madina, Jens, and Patricia. When one thinks about internet search, one obviously has to keep in mind that there is a big player in this market, and that is Google. We all know that Google's market share is close to 100%, at least approximately. What you can see in this figure is how it changed over time. I put in this little symbol to mark when ChatGPT came about and, at least in some sense, changed everything, so one could expect it to do something to search. In February 2023, Bing got some functionality based on ChatGPT technology, basically providing a new type of search. But at least until, let's say, the end of last year, not much has changed in terms of Google's market share, which is interesting, because some people have argued that it could have quite a big impact — and of course this could still come in the future.

The motivation for this paper is what I just showed you: the market for internet search is highly concentrated. Looking back a bit, it is actually interesting that in the '90s and early 2000s there was some sort of competition in this market, but then Google took over, and since then Google has been the leader. What we have in the backs of our minds in this project is that there is a vicious circle going on. Google has most users. When these users use Google, they produce data on past searches. What are these data? I will be more explicit in a bit, but essentially Google provides them with a menu of search results and they click on some of them, and we can see this as a revealed-preference measure of what users find helpful — basically click-through rates. So Google has most users, therefore most data, and for that reason it can, at least potentially, provide better search results. And of course what Google offers is a whole ecosystem, with all of this integrated with other apps. The big question that has been around in the discussion for a while now, and on which we have seen some policy action recently, is: should this market be regulated, and if so, how? On the right you see a depiction of what I just said: data lead to a better user experience, because more data give better search results; this creates more demand for the services offered by Google; this in turn creates more data, and so on and so forth.

What we do here is, to some extent, a measurement exercise. Obviously the user experience depends on many factors, and when you think about it, it is actually not so easy to measure the effects I just talked about. To quantify them, we have one relatively well-defined goal for this paper: we want to measure the dependence of the quality of non-personalized search results on the amount of data a search engine has available, holding the algorithm fixed. What we want to get at is the production function that turns data, as an input, into search results. Our approach was not to work
together with Google, but with a search engine that is much smaller and that tried to enter this market. I will tell you more about its motivation and business model later. What this search engine allowed us to do is to keep its algorithm as it was and to vary the amount of data it used as an input to provide search results. We could then look at those search results, measure their quality, and see to what extent the quality depends on the amount of data — and for which queries it depends on the amount of data.

The results are measures of the dependence of search engine quality on data, broken down by the popularity of the search term, or the so-called query. That turns out to be really important; in some sense, doing things by popularity of the query is already a big step in this paper, and you will see this in the results. As a byproduct, we also get measures of quality across search engines, because we came up with a way of assessing quality. That is of course useful for seeing how quality varies when we vary the amount of data within one search engine, but it is also interesting to look across search engines, and therefore we also obtained search results from Google and Bing and assessed their quality too; this gives us some interesting results as well.

There are of course some limitations, and they are inherently linked to what we are doing and how we are doing it. First, Cliqz — the search engine we cooperated with — is all about privacy, and they provide non-personalized search results, or at least that is what they gave us. Second, in our experiment we could not give them additional data and ask them to produce new results; what we asked them to do is to drop data, so in that sense the exercise is backward-looking. Third, to some extent what we are doing is local to Cliqz; things could be very different for Google. We have some indication that this is not the case, but we cannot be 100% sure. The last limitation, and I will explain in detail why it is one, is that we keep the algorithm fixed: if you drop data, wouldn't you want to retrain the algorithm, rather than sort of ending up with the wrong parameters for that algorithm?

Good. If there are no questions, let me jump into a little bit of background information that is useful for understanding what is going on. On top here you see a query. I called this the search term before; it is what you type into the address bar of your browser or into the search field of a search engine like Google. What you get back is a set of search results. In this particular example there are no sponsored results — nobody has bid on this query — but you see the organic search results. One important side note, which really matters from an effects perspective: there is the first page that people see on their screen, and then there is everything
that comes further down when you scroll. What really matters is what comes on top; we have seen this in all kinds of contexts, and there are papers discussing it. Now, when people search, data are generated. What you see here is an illustration of that: people type in a query, they get a set of results, and what is kept track of is what they actually clicked on. So somebody typed this in and clicked on this URL, somebody typed this in and clicked there, and so on. This is literally a log: a recording of all the activity that has taken place.

[Audience question] So the log doesn't include the other URLs which were listed on the page?

An important qualification from my side: we have tried our best to understand how this is done, and as far as I understand, this is not recorded — you can of course reconstruct it. You are asking whether the choice set is recorded, if I understand you correctly? As far as I understand, it can be reconstructed but is not recorded, but I am happy to be corrected on that.

[Audience question] Is that general for all search engines, or just for the one you worked with?

I am lacking the technical knowledge here; people who are closer to these companies may know more. But whenever I have talked to companies, they told me that they do not record everything. For search, for instance, they often record what people click on but not all the things they could have clicked on, and my working assumption is that this is also the case here: they mainly record what people did, not what they could have done, because they do not want exploding data warehouses. But maybe Greg has some insight on this and can share it later.

[Moderator] I know I am moderating rather than commenting, but I seem to remember a presentation by Hal Varian where he talked about all the experimentation Google did to improve its search, and I think he said they also looked at whether people bounce back immediately: they click on something — do they stay there, or do they bounce back immediately? If they bounce back immediately, clearly it was a bad choice.

Yes, exactly; I have seen that in papers as well. One measure of search result quality is whether people go back and search again, or click on two of the results — the last one they click on is often deemed to be the best one. That is one way to think about it. But as far as I know, not all the search results shown are saved; again, this is my working assumption.

So, one can take these query logs and turn them into counts — an abstraction or simplification, if you like. Here is a query, "Lady Gaga best songs", and there is another query, "Lady Gaga hits". What is saved, and you can think of this as a big table, is: for the first query, URL one has been clicked 100 times, URL two 30 times, URL three once; for the second query, URL one 120 times, URL four 10 times, URL five twice. This is what we think of as data. It is not 100% accurate, but we have checked with the people at Cliqz we talked to, and they said it is a good enough abstraction for what we have in mind; in reality it is much more complicated. Now a new query comes in that has never been seen before: "Lady Gaga best hits". It is a combination of, or at least should be related to, the two existing queries. What a smart algorithm will do is figure out how this new query relates to the two old ones for which it has data, and then it will produce search results. These results will somehow involve the URLs that people clicked on for the other two queries, because the new query is related to the old queries for which the search engine already has data. You see this here: it produces results, and the results are URLs one, four, two, and five — these are the URLs the algorithm deems relevant in response to the new query.
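To make this abstraction concrete, here is a minimal sketch in Python of a query-log count table and of how a new query could, purely for illustration, borrow candidate URLs from related known queries. This is my own toy construction, not Cliqz's actual ranking logic; the names and the simple word-overlap rule are invented for the example.

```python
from collections import Counter

# Query-log counts as described above: for each query, how often each URL
# was clicked. The numbers mirror the Lady Gaga example from the talk.
click_counts = {
    "lady gaga best songs": Counter({"url1": 100, "url2": 30, "url3": 1}),
    "lady gaga hits":       Counter({"url1": 120, "url4": 10, "url5": 2}),
}

def candidate_urls(new_query: str) -> list[str]:
    """Toy illustration only: pool click counts from known queries that
    share at least one word with the new query, and rank the pooled URLs
    by total clicks. A real engine relates queries far more cleverly."""
    new_tokens = set(new_query.split())
    pooled = Counter()
    for query, counts in click_counts.items():
        if new_tokens & set(query.split()):   # any word overlap
            pooled.update(counts)
    return [url for url, _ in pooled.most_common()]

print(candidate_urls("lady gaga best hits"))
# -> ['url1', 'url2', 'url4', 'url5', 'url3']; url1 comes first because
#    it was clicked for both related queries
```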
Now, what is the experiment that we ran with Cliqz? Let me first tell you about Cliqz and give you a bit of background, which is also useful for understanding the experiment. Cliqz had the goal of doing things differently from Google; in particular, their thing was to safeguard the privacy of users. Google's business model, in a very abstract way, is to provide a good service to its users and then to sell advertising and make money with that, in order to finance all the quality it provides to users. The concern that many people raise, and that makes people a bit uneasy, is that along the way Google collects a lot of data; Google is not exactly known for putting privacy first. Some people are fine with that, others are not. The idea of Cliqz was to find a way of preserving privacy by moving the computation related to showing personalized ads to the users — not giving one party lots of data on users, but building a privacy-friendly version of Google's business model. I do not want to focus much more on that, because it would be a completely different talk, but one should keep in mind that privacy was the main motivation for all this, while it was also important to make money at some point.

Cliqz did collect data — in fact, quite some data — and that is important to keep in mind when looking at our results. In 2017, Firefox, a browser that was fairly popular in countries like Germany, where Cliqz was based, had a Cliqz extension built in for 1% of its users, with an opt-out mechanism behind it, so people could opt out. What this extension did was collect data on what users had been doing, including when they went to Google, searched for something, and then clicked on something. So this was data Cliqz actually collected, but in an anonymized way, so that it could not be linked back to users. One could of course have concerns about that as well, but in any case it produced quite some data that they could use to build their search engine. We got in touch at some point, actually after they had done this, before Covid hit,
but then during Covid we ran our experiment with them. The end of the story is that this was one thing they still wanted to do, and two days after conducting our experiment they announced that they would exit the market. They were part of a bigger media firm, and that media firm pulled the plug. Our interpretation is that the management of Cliqz — and they have always been very forthcoming; they really thought the experiment was worth doing — pushed it through, but two days later they had to stop. They announced that they would shut down the browser and the search engine on May 1, 2020. It was a very small firm, in some sense — 45 employees, all of whom found other jobs quite easily during the pandemic — but still, that is the Cliqz story in a nutshell.

Now, the experiment was the following. We can think of an algorithm as having parameters theta — many, many parameters; the model behind ChatGPT also has many, many parameters. It produces a set of search results R for a given query q, a given index I, and data D. What is the index? The index is simply a directory of all the URLs that are out there on the internet. You can relatively easily put such an index together by crawling the internet; that is not the huge job. The huge job is to get enough data to make sense of the index. So again: the search results are a function of the query (what I type in), the index (like a phone book, a directory of all URLs — basically a list of phone numbers), the data D from past usage of the search engine, and the parameters theta.

What are the usual dynamics? Users use a search engine and produce data, so D grows over time, and once in a while — or maybe even continuously — you can use the data to retrain the algorithm; one can think of this as finding a better theta. So theta is, in some sense, a function of the amount of data available, while one can think of the index as more or less stable over time; you would of course update it as new websites come about, but that is not the big deal as far as we are concerned here.

What is the experiment? We drop some of the data in D while keeping theta and I as they were for the full data. When we do that, we can produce new search results for the same query q, look at them, and ask ourselves how good they actually are. What does it mean to drop data? When you look at these two queries and their query-log counts, then if we drop, say, 50% of the data, 100 becomes 50, 30 becomes 15, and 1 would become 0.5, but we round downward, so it becomes zero and that URL vanishes. By and large, for large numbers, the relative counts stay the same, but of course the algorithm could still base its results not only on the relative counts but also on the absolute counts, so that is taken on board here. We do this, as I said, by using buckets, and these buckets measure the popularity of the queries — that is actually quite interesting, we think.
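A minimal sketch of the data-dropping operation just described: every click count is scaled by the share of data retained and rounded down, so rare URLs vanish while the relative counts of popular URLs are roughly preserved. Function and variable names are my own; the actual procedure Cliqz ran may differ in detail.

```python
import math
from collections import Counter

def drop_data(counts: Counter, keep_share: float) -> Counter:
    """Scale each click count by keep_share and round down, as described
    in the talk: with keep_share = 0.5, 100 -> 50, 30 -> 15, 1 -> 0
    (the URL with a single click disappears)."""
    reduced = Counter()
    for url, n in counts.items():
        m = math.floor(n * keep_share)
        if m > 0:                 # zero counts are dropped entirely
            reduced[url] = m
    return reduced

full = Counter({"url1": 100, "url2": 30, "url3": 1})
print(drop_data(full, 0.5))   # Counter({'url1': 50, 'url2': 15})
print(drop_data(full, 0.01))  # Counter({'url1': 1})
```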
What we do is first form five buckets — or rather, they formed five buckets; they said we could do it like this, and we said okay. Within each bucket we have a thousand queries, and then we drop data, which gives us twelve levels of data availability, and each time we obtain new search results. This was all run overnight on Monday, April 27, 2020.

First, the buckets, for some background. The fifth bucket contains the least popular queries and the first bucket the most popular ones. The most popular queries are 0.2% of all queries, and at the same time 11% of the traffic comes from these very few queries. You also see here the average number of searches per week per query. The least popular queries make up 75% of all queries; each of them is searched on average once per week, and together they generate more than half of the traffic. That is important background, because intuitively one can already see that it may be easy to produce good search results for the most popular queries and hard to produce them for the least popular ones — and it is the latter that generate most of the traffic.

How did we assess quality? We did various things: human assessment and so-called automated assessment. Automated assessment basically means comparing the search results to Google's and calculating the percentage overlap (a sketch of such an overlap measure follows below). Two equally good result sets should probably not be identical, but one can see interesting patterns when one does this; I do not have time to go into the details here. The main thing we look at is human assessment: we take the actual search results and ask people how good they are, without telling them whether they come from Google, Bing, or Cliqz. We give them five search results and ask how good these results are. We also created blended result sets, simply by pooling all the results we obtained and then asking which of the five shown is actually the best; I will show you a screenshot so this becomes clearer.

We do this for a sample, because it takes a lot of time. Everything is centered on Germany, so we obtain the result sets as if we were using Google or Cliqz from Germany. Queries can be in English or in German; we select those with at least three characters, and then, as I said, we evaluate the top five results for Google, Bing, and Cliqz — and for Cliqz at five levels of user data, because here too evaluation takes time and we wanted reasonable numbers of evaluations per condition. For this we used two panels, one English-speaking and one German-speaking, with 587 so-called click workers; think of these as workers on Amazon Mechanical Turk or Prolific.
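The automated assessment mentioned above compares a result list to Google's and computes the percentage overlap. Here is a simple sketch of such an overlap-at-k measure; this is my own formulation of the idea, and the paper's exact definition may differ.

```python
def overlap_at_k(results: list[str], reference: list[str], k: int = 5) -> float:
    """Share of the top-k reference URLs (e.g. Google's) that also appear
    among the top-k results of the engine being evaluated."""
    top = set(results[:k])
    ref = set(reference[:k])
    if not ref:
        return 0.0
    return len(top & ref) / len(ref)

# Hypothetical example: two of Google's top-5 URLs also appear in Cliqz's top 5.
cliqz  = ["a.com", "b.com", "c.com", "d.com", "e.com"]
google = ["a.com", "c.com", "f.com", "g.com", "h.com"]
print(overlap_at_k(cliqz, google, k=5))  # 0.4
```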
Good. This is what we showed the evaluators for a query. On the right you see the results, one to five, and on the top left the question: how satisfied are you with the results overall — so with the set of five results, which, even though we did not say so explicitly, could include the ordering. The scale runs from completely satisfied down to completely dissatisfied, as if there were no results at all. We also asked people which results were the best and the second best, respectively, and we told them in the instructions that they should read the text and look at the results in order to judge — that was part of the brief, so they were encouraged to actually follow those links.

Good. In the interest of time, let me jump right to the results. What you see here, for Cliqz, Bing, and Google, is the average rating — the overall average across all queries, popular and not so popular. On average we are somewhere between mostly satisfied and somewhat satisfied, between five and six; we gave the numbers these meanings. And we see, to some extent, what we would expect: Google comes out on top, Bing is a close follower, and Cliqz is less good. This is probably what also drives consumer choice — Cliqz was not very popular, and this could be the reason: its results were simply not as good as Google's.

Then we asked how this looks when we differentiate by the popularity of the query, and what we see is actually quite interesting, we thought. For the most popular queries, quality is very similar across the three search engines; yes, Cliqz is slightly worse, and Bing even comes out better than Google. But the further you go to the right, the less popular the queries are, and in particular for the rarest queries — each of which generates little traffic, but which together generate most of the traffic for a search engine — Cliqz is clearly much worse than Bing, which is in turn worse than Google. What this already hints at is that it may really be the popularity of the query that matters in the background.

The second way we look at the data is this one here, and we do this in various ways, with various measures of search result quality, but it always looks more or less like this. This is now only for Cliqz. At one — the full data Cliqz had available — we have an assessment of the quality of the results, by popularity of the query, so there are five dots corresponding to the five panels in the previous figure; these are the same numbers as before for Cliqz. What we did in addition was to drop data: for half the data we obtained new search results from Cliqz and evaluated them as well, and likewise for 20%, 10%, and 1% of the data. What we asked them to do — what they could do for us — was to keep the algorithm as it was, simply drop some of the data, manipulating the query-log counts if you will, and give us new search results. What we then see is that, for the most popular queries, dropping half of the data is really not a big deal, dropping 80% of the data is not a big deal, and dropping 90% of the data is not a big deal — it even produces slightly better search results if you just look at the dot here, but you also have the
confidence intervals here, of course. Only with 1% of the data do they do worse. But when you move down to the rare and rarest queries, you see a clear slope: drop half of the data and the results are worse; drop 80% of the data and they are worse still, and so on — you get this concave shape out of all this.

Here we used the alternative measure, the average overlap of the search results with Google's. The overlap is not one, which is also interesting; we did this for the top three and the top five, as far as I remember. It is not 100%, which might suggest that one can produce equally good search results — at least in the eyes of the assessors — without reproducing Google's search results. So Google is probably not almighty, and other search results may be equally good for the same query. But importantly, for the popular queries things are still flat — we do not see much of a difference in terms of the shape — and we get the gradient back for the rare and rarest queries.

Here is yet another way to look at it: the position of the best-rated result. By design, our analysis gives us the best URL for a given query. Taking a step back: we took a query and put it to Cliqz, which gave us results; then we gave it to Google and to Bing; then we took all of these results and created a list of candidates that was most likely to contain the very best result; and finally we asked our assessors which result, in their eyes, was actually the best one. Analyzing that gives us this figure: with 100% of the data, this is the fraction of times the very best result in the eyes of the users was actually shown by Cliqz — and shown as the best result, as you can see here; in that sense this is meaningful. With less data, this was less likely to be the case: the yellow area is basically growing. This means that the more data Cliqz has available, the more likely it is to show the very best result. Now you can ask: is Google almighty or not? As a byproduct of our analysis, we see that for Bing and Google it looks like this. This tells us that the very best result in the eyes of our assessors is not always the one that Google shows, but of course the likelihood that Google shows it is bigger than for Bing, and from the other analysis we understand that this is mainly driven by the queries that are rare or rarest.
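To make the construction of that last figure concrete, here is a sketch of the statistic being described: the share of queries for which a given engine's top results contain the URL that the human assessors judged best in the blended result set. The data structures and names are my own illustration, not the paper's code.

```python
def share_showing_best(best_url_by_query: dict[str, str],
                       results_by_query: dict[str, list[str]],
                       k: int = 5) -> float:
    """Fraction of queries for which the assessor-chosen best URL
    appears among the engine's top-k results."""
    hits = sum(
        1 for q, best in best_url_by_query.items()
        if best in results_by_query.get(q, [])[:k]
    )
    return hits / len(best_url_by_query)

# Hypothetical toy data: the engine shows the best URL for one of two queries.
best = {"q1": "a.com", "q2": "x.com"}
shown = {"q1": ["a.com", "b.com"], "q2": ["y.com", "z.com"]}
print(share_showing_best(best, shown))  # 0.5
```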
So this brings us, in some sense, full circle. One way to think about these results is that a small search engine can apparently produce non-personalized search results that are as good as Google's if it has access to enough data on past searches; we can arguably learn this from looking at the popular queries. We know that users care about search result quality, so in a way the market for internet search is data-driven — the data-drivenness comes into play because the search engine does not seem to have enough data for the rare queries to make use of its algorithm and produce good search results there. This hints at mandatory data sharing as a policy reaction, which has in the meantime been implemented in the Digital Markets Act, and in that sense our paper provides some empirical justification for doing that. More generally, we like to think that our approach can be applied to other markets and services as well — our measurement approach of varying the amount of data, producing some sort of results, and then measuring in some way how much consumers like those results. That is what I had. Thanks a lot.

[Moderator] Well, you make moderation very easy by being, A, extraordinarily clear and, B, extraordinarily on time, so thank you very much, and I will hand over to Greg for his discussion. Greg, I realize I did not actually introduce Tobias, but he may have done that already — he is from Tilburg. Greg is from the University of Zurich, but I had not actually realized, until I got the email for this session, that he is also the chief economist at Zalando. He may describe what that is, but I dare say it means that he has to deal with search results in reality these days, of a specific sort. So, Greg.

[Discussant, Greg] Thanks very much to you all for the invitation, and to Toby for the paper — I really enjoyed reading it. I too ran an experiment. For those of you who do not know Zalando: I am a shoe salesman now; we sell shoes and clothing throughout Europe, not so much in the UK, but we are in 25 European countries. So I also ran an experiment: I read the paper and thought about what comments I would give, and then I subsequently reached out to three of our applied scientists who are active in our search products and solicited their brief thoughts. To be fair, I was trying to be conscious of their time, and I can confidently say that the quality of their contributions is higher than the quality of my contribution — although you will get the bundled product, Toby, the full McCoy.

Let me start by saying that I think this is a hugely important research question, and most of what I am going to say now is with my professor hat on. Before joining Zalando I was fairly involved in, or at least took an interest in, European policymaking in this space, and I continue to have that interest. This really is important because, if we want more competitive outcomes in search markets, we need to understand which factors drive competition and drive those outcomes, and there is, at least in the economics literature, not very much evidence in this area. So, Toby, I think what you and your co-authors have done is a super important first step — kudos for this, and keep going.

When I was reading the paper, my first comment was also a comment that two of the applied search scientists came up with, which is: what do we mean here by data? In the paper there is this idea that more data is better — data is treated as a sort of
homogeneous input. But when the applied scientists — the search data scientists — were thinking about it, the question is: is it really enough to say that more data is better? In particular, you yourself mentioned that these are non-personalized search results, so how important is personalization for search result quality? You did say "this is what they gave us", which suggests that maybe there is a dimension of search result quality that you would have captured had you incorporated personalization. Don't get me wrong, Toby — I am raising this question fully realizing that you may not be able to do anything about it, but just be aware of how specialists might think about this, and of how you then want to craft the paper and ultimately speak to policymakers about it, because I think that is really where the payoff of this paper lies: the potential to hopefully shape policy. In particular, one of the applied scientists mentioned that more data here might even mean fresher data — freshness in a search context can be quite important, something I had not thought of — so just to feed that back.

Now, why I think these comments about what we mean by data are actually quite important is the question of how relevant the experiment you ran is for the policy space we hope it speaks to. Where I got a little bit lost is this: I understand that the algorithm was trained on the full data, but then the experiment was to increase or decrease the amount of data that the algorithm was ultimately able to use to make its prediction, and that mapping was not fully clear to me, neither in the paper nor in the presentation — which is totally reasonable; I am sure it is not simple, and I realize these people no longer work at Cliqz, so it might be hard to figure out. But I do think it is important in the sense of: can this experiment inform policymaking — is it relevant to the actual environment in which competitors would get access to data from, say, a Google or a Bing? Because at some level your results could have been, in some sense, far worse if they had actually trained the algorithm on the smaller set of data and then also used the smaller set of data to produce results. To the extent you can address that, I think that would be super interesting.

Let me make a final comment, which is a blend of what one of the applied scientists said. I talked to a senior leader, a mid-level leader, and a junior person, and the junior person had by far the most detailed and, in this case, best comments in my view. He pointed me to something called neural scaling laws. You can look this up on Wikipedia: for a class of machine learning models based on neural networks, there are four key variables that relate to each other, where one of the variables is model performance — that is the outcome variable — and there are three key drivers: model size, training data size, and computational cost. At some level I think your paper speaks to two of those — model size and training data size — but not so much computation. If you like, you can say that model and computational cost can be bundled
together and call that your theta — I think that is fine. But the issue — and I know very little about this; he pointed me to a few papers which I will share with you, so you can read up a little on this — is that all three of these things matter. For example, he told me that OpenAI learned that if you increase the computational complexity of the model by some factor, you only need to increase the amount of training data by a factor of five. These are big factors, obviously, but the point is that there is previous research, if you like, in the data science and machine learning space that may speak to the exact same question you are looking at, which you can rely on and learn from (one commonly used functional form from that literature is sketched below).

And here is where I want to add something that is unbelievably important from a policy perspective. At some level your paper is talking about data, but imagine Cliqz or some other competitor had access to all the data in the world — all the data that Google or Bing has. If that is not enough to actually generate competitive results, if you also need that model complexity (which sounds to me like applied scientists) and the computational cost (which sounds to me like cloud computing and big budgets for it), then data sharing alone will not work in this space. So we economists need to understand the technology in all of these markets — but let's start here with search — if we actually want to help foster competitive outcomes. And these were five-minute answers from people who work in this area, immediately highlighting that maybe it is not just the data. It may be that your paper cannot speak to that, but I think that when you craft the policy section you should flag it and say: look, we have not touched on these other two aspects, which may be as or even more important, or which are complements, so that you need all of them in order to generate competitive outcomes. Because what we do not want — or at least what I do not want — is for someone to say all we need to do is give them the data, then through a huge political effort the data becomes available, and it does not make a difference. If we are going to design a regulatory intervention, we have to design one that actually could work. And I would be happy to put you in touch with the people at Zalando if they are interested — I am happy to help move you in that space, because I do think it is critically important. Anyway, I will stop there. I know I ran long, so I made your job a little harder, Amelia, but —
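For reference, one commonly cited parametric form from the neural scaling-law literature the discussant refers to — this is background from published work (e.g. Hoffmann et al.'s "Chinchilla" paper), not something estimated in the talk or in the paper under discussion — relates expected loss L to model size N and training-data size D:

```latex
% A standard parametric scaling law (Hoffmann et al., 2022):
% expected loss as a function of parameter count N and data size D,
% with fitted constants E, A, B, \alpha, \beta.
L(N, D) \;=\; E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```

Under such a law, performance depends jointly on model capacity and data, which is exactly the discussant's point: access to data alone may not close the quality gap if model size and compute are also binding constraints.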
