A little over a year ago, I wrote a 6000-word retrospective, A Year of MLC: Selfish Takes Only, reflecting on building ML Collective, the non-profit and non-traditional researchers community, for a full year.
Before I could blink, another year had passed, and we found ourselves at the two-year mark of the birth of a one-of-a-kind organization. (My uncle and aunt have a tradition of taking pictures on their wedding anniversary, and my uncle would jokingly add a caption to those photos: “Still Married!” In that sense I want this series of posts, for however many years that I’ll be writing, to be subtitled: “(MLC is) Still Alive!”)
I am late in posting this year’s MLC retrospective, although there’s not actually a clear timestamp of when exactly ML Collective was founded. The idea of a distributed research lab, and setting it up as a non-profit company, probably entered our minds somewhere in April 2020, when I was out of my previous job and rejected from every single place I interviewed — and found myself having to think creatively about what to do next (more covered in the first 30 minutes of this talk).
Jason and I lived in the same neighborhood then, and would take masked, socially-distanced walks while experiencing the first ever lockdown and pandemic of our lives. On one of the walks, I broke the news of my last job search failure: “This is it. I’m officially jobless.” We happened to be approaching an intersection. He stopped me and said: “One day you’ll look back on this day and remember this intersection, and be so glad of what happened from here on.”
He’s right, as always.
Photo taken April 17, 2020 (likely the "intersection day"). Lower Haight, San Francisco, USA
Last year, I wrote about all the “selfish”, personal takes from running MLC: how it changed my own life for the better. Many things remain the same this year, but a lot has also changed. I’ve both gained and lost insights, and like some of you, struggled to keep afloat at times. This post is once again a personal retrospective: no growth of numbers or showcase of dollar signs, just thoughts and feelings. I am not speaking on behalf of MLC, but only myself.
The tagline this time is “My Protests,” for it happens to be the phrase that best summarizes what I do — not only for work, but for almost everything that contributes to my identity. I’d like to think that my current, converged way of taking a chance in the world to work and live, to position myself for the future, is to a large extent a “protest” to what’s seen, described, celebrated and promoted as the “mainstream,” “tradition” and “standard” in society, at least in the bubble where I currently live.
In terms of content, last year I had 5 sections (of selfishness) — “The achievement-minded,” “The creativity axis,” “The better connections,” “Before I am ready,” and “A more sustainable view of philanthropy.” This year I again have 5 (protests). Although, actually, I had initially drafted 9, and had planned on each of them being short and sweet to not make this post twice as long. As the natural course of writing goes, I ended up having more to say on some and finding others less well founded. So in a poetic coincidence we ended up again with 5 insights: “The unglamorous,” “The human element,” “Talented and underprivileged,” “Inclusive or exclusive,” and “Small scale problems, at a large scale.”
For context, I start with a recounting of updates that happened both in my professional life and with MLC over the past year. If you know me and my life well, you can skip ahead to the Protests section for this year’s insights.
- Personal update: Google, CSS, and more
- MLC update: Office Hours, Town Halls, and more
- My protests
- Tiny difficult things
I briefly mentioned in this post that I joined Google Brain last April, so the entirety of the second year running MLC is on top of me having a full-time researcher’s job at Google. “How do you do it?” Many have since asked me, “How do you manage having two jobs?” “Well, I don’t.” is my truthful answer, “I just accept that I fail at managing either.” This is not a humblebrag — I honestly don’t think I’m doing a good job at being either the kind of Executive Director of MLC that I know I could be (judged by what growth and developments we could’ve had), or a researcher at Google (judged by the standard research scientist rubric). But the good fortune permitting me to get by, for now, is that both jobs come with rather flexible definitions.
The job at MLC is flexible in that it is really up to Jason and me. We don’t have to listen to investors (yet) or a board (of anyone other than ourselves), so my rather limited work output and the resulting minor stagnation are tolerated. Although, to be fair, we do feel that we have to listen to the community. It is a strange land to be operating a company that is itself a community — members in the community are at the same time your customer and your colleague, your audience and your spokesperson. Meeting the community’s needs is the sole purpose we are fulfilling, and not being able to fully do that to my satisfaction is my biggest guilt and source of failure.
The other job, a researcher at Google Brain, thankfully, also comes with enough freedom and flexibility. A research scientist is expected to both do good science and actively contribute to the wider research community, the latter being where the MLC work fits, as well as my other work serving the academic community (e.g. speaking at, reviewing for, organizing and chairing academic conferences). “Fostering the development of the next generation of scientists” is also listed as one of the goals in Brain’s approach to research, which is very much aligned with MLC’s main functionality. And I still publish a lot.
So, that may explain how I’m getting by with “both jobs,” although in fact I would just call them “one career.” Many people walk around wearing many hats, and we all have a portfolio of beliefs, interests, and goals, but not all of us can have the opportunity to pursue them all at the same time. Sometimes we shelve a dream in the service of a more immediate need. For that I am simply grateful that there’s an arrangement that allows me to grow both in my scientific expertise and passion in community service.
In fact my biggest commitment and workload of the year was neither MLC nor Google. It was serving with Krystal Maughan as the DEI chairs of ICLR 2022. It’s such a special appointment to me as ICLR is my favorite conference (for one, I chose to announce my new job at last year’s), and DEI is something I deeply care about. The workload of being involved in an academic conference (GC-ing, PC-ing, SAC-ing, AC-ing or reviewing) is usually predictable — every role has its duty pretty well defined and expected — and so it is for DEI chairs: the common workload involves attending all the organizing meetings throughout the organizing of the conference, providing suggestions and recommendations especially with regard to increasing diversity, ensuring equity and inclusion, on topics like speaker invitation, attendee outreach and accessibility, etc. The job — more accurately, service — is to safeguard, assist, and support choreographing a conference; to minimize the risk of any harm, with relatively little ask for innovation.
However, perhaps both Krystal and I were harboring too much frustration, anger and ambition when it comes to the state of affairs with DEI (too much talk, too little accomplished), so the first agreement between us was that we’d like to commit ourselves to making real changes. “Let’s just directly put resources into helping underrepresented individuals do research, submit and publish” was the simple idea behind our 8-month CoSubmitting Summer (CSS) program, launched with the call for “Broadening Our Call for Participation to ICLR 2022.” The experience was quite educational, fulfilling and rewarding in the end, but not without pain, difficulty and guilt throughout. We wrote up a concluding blog post, “Reflections on the DEI initiative at ICLR 2022,” summarizing lessons and memories from running this program.
As I am writing this post, Krystal and I have both received and accepted a renewed invitation from the General Chair to continue to serve as DEI chairs for ICLR 2023. It’s going to be really exciting as ICLR is returning to in-person (after 3 years of virtual conferencing) in 2023, and will be held in Kigali, Rwanda! Both of us are incredibly honored and excited, but also not without trepidation. We will rethink what the renewed program is going to look like, with the same goal of ensuring a positive start in one’s first taste of research (helping them make a submission for the first time), as that small experience (how helpful or welcoming a community is), and the feelings associated with it, are known to predict one’s future trajectory. At the same time, ensuring a positive start does not mean that we are withholding the full complexity and difficulty of such a trajectory — research is hard and collaborating with people on it both eases and complicates the matter. So it looks like my next year’s workload is already pretty much charted out. I’d probably have nothing new to summarize a year later.
The most celebrated way of looking back on a year of progress, of presenting an exciting summary, is by answering the question “What’s new?” — this is how we greet each other these days, hoping to hear something unexpected, unthought of, out of the normal distribution. But I’ve come to realize that the hardest part of running an organization (or managing one’s own personal life, really), the part that takes the biggest commitment, is not coming up with new ideas but rather keeping the “un-new”, unexciting, the humdrum of daily activities going. The sheer effort of maintaining and strengthening existing structures and processes in an organization — preventing it from collapsing — greatly outweighs any effort to bring in a new idea.
Don’t get me wrong, it’s not like there’s only labor and no innovation involved in maintaining things. On the contrary, a much larger number of innovations and sparks of creativity are needed in maintenance, only each of them is not as noticeable. I can assure you that it kills more brain cells to keep a 5-year reading group alive than to put together a new office hours initiative, or to write a new workshop proposal. I learned that even the strongest arm atrophies quickly, if one doesn’t keep up with the training regimen. What that means in the context of event organizing is that just because you’ve had a good audience before, it is not guaranteed that they will come back for the next iteration of the same thing, if you don’t pay close attention and make needed adjustments. Most types of work feel like a child that you have to take care of, but more so in event organizing than others: events grow and evolve a personality; they even have an environment (the audience) that they interact with, and as a result, a wise parenting move is not to arbitrarily or aggressively control such a personality development, but rather read the room and understand the dynamics first.
MLC is all about people, and people are glued together by events. As a result, participation and engagement are the main metrics we care about. However, to be honest, it is rather draining to be in the business of event organizing, where a real-time audience gives you feedback right away. I sometimes liken it to financial trading, where the feedback loop is short (compared to research, at least), as your profit and loss are almost immediate. I’d also liken it to a customer-facing business, where all business strategies are geared towards customer satisfaction. And the harrowing difficulty lies in that customer happiness is hard to drive, let alone optimize, and the audience gets bored easily.
To stay keen in the ever-shifting environment, it’s worth noting a few trends that affect event-centric organizations and businesses such as MLC:
- “More choices than ever”: a trend towards a more open and accessible culture, which brings on a market of varied sources of information. As the number of content absorbers grows much more slowly than the number of content distributors, the competition for audience attention is fierce.
- “Easier than ever to leave”: a trend towards more virtual or remote engagements, which take little effort to join and leave, and therefore it is much harder to maintain a consistent audience. The volatility in audience size is going upwards due to the low commitment requirement.
- “Demanding more than ever”: a trend towards more democratized content distribution, which makes it easy for anyone to be a host, an influencer, a voice of impact, or a leader. It at least inspires all of us to want to take a more engaged role in events: no longer just a passive attendee, but an active contributor. It is a positive movement in that we get more do-ers, but it also means that the old event format — small number of active contributors, large number of passive attendees — no longer meets the needs of the majority.
Recognizing these changes and their effects makes us feel the need to subtly alter our goals: from “having the absolute best reading group” to “having the best platform that potentially helps breed a number of great future reading groups.” It is a change from trying to be the best YouTuber to trying to be YouTube itself. We are doing this in a number of ways:
- Perpetuating our open-access format. All our events give everyone the right to speak and take over the stage, rather than just passively attend.
- Our radically participatory planning and execution event, the Town Halls, encourages everyone to act as an owner of the community, to be “in the room where it happens.”
- Our Office Hours program has grown rapidly from 3 hosts to 9 hosts in 4 months.
- Our push towards self-started, self-organized Interest Groups.
MLC is still a research lab at its core, and we still want nothing but to produce the absolute best science. But in order to do that, an essential stepping stone is enabling people to produce the best science. The nature of us being in the humanity sector (as an alternative term to non-profit) means that we will always focus on people first, with the knowledge that it will naturally lead towards also focusing on science. More about this will be discussed in the Human Element section.
All my learnings this year have started organizing themselves around one theme: you really have to do things like no one else. From the economic point of view, this makes sense: if you simply follow what everyone else is doing, unless you are among the absolute best, there’s little business value for you to exist in the market, and hence no edge or advantage to drive your success. But this is also a nuanced statement; in every single aspect of either a business or a personal life, there is a choice to be made of either being a trend follower, or a contrarian. Not only is it impossible to stick with the same philosophy (e.g. “be a contrarian”) everywhere, but it’s also not necessarily most profitable to do so.
One often chooses an action that’s aligned with their safety level at the moment — following the convention always feels safer and requires less courage, and going against the tide is risky. And on each of the dimensions in life where we feel we have a choice, there’s a comfort region within which we perform in accordance with our safety level. One of the most common pieces of self-improvement advice out there is “going out of your comfort zone”, but stretching too much beyond it can leave one prone to anxiety and exhaustion, leading to overcorrections down the road.
So I try to be careful when I say I’m fighting against all the conventional values — I am not; most of the time I’m still a convention follower. And instead of “fight” I would use “protest”: that’s the difference between waving a flag in your face asserting that you are wrong, and living with my own belief and trying to make it seen enough, heard enough, so that it can be a case study for others to cite in their own non-traditional survival guide.
In a lot of ways I am still traditional: I care about publications, citations, academic respect no less than I did before; I crave confirmation, recognition and proof of worth not much less than anyone else. But even with a largely unchanged objective, there are small regularizations we can add to make it not a solely greedy path of movement towards the goal, and sometimes counterintuitively, to make the original objective easier to optimize. Below are my regularizers for this year.
I live in San Francisco and I work in AI research — perhaps one of the most glamorous professions at this moment in history. So it might sound ridiculous for me to say that my No. 1 protest is that I “work on unglamorous things.” But with certain context normalization, I’d insist that it is the case. In particular, I see these as some of the current embodiments of glamour in our field: the splashiest paper that everyone is talking about, the hottest startup that everyone wants to join, the largest amount raised in series-X… and I am certainly not involved in any of them.
To be clear, I’m not dismissing glamour. Some successes and breakthroughs deserve to be celebrated, and in some cases it can be the right reward structure to draw ambitious people in. However, there are at least three major problems with the current glamorization in AI/ML:
- As a society we tend to associate glamour with superficial, easily quantifiable measures that are far from representative of true contributions, or comprehensive enough to embrace all valued dimensions.
- Anything under glamour has already drawn a lot of people in and hence has become a fierce battleground, one that breeds anxiety and raises the chances of failure.
- Reading only things with glamour gives one a false impression of the field overall, a typical manifestation of survivorship bias.
In the end, a glamour-seeking mentality creates a community that’s short-sighted, narrow-minded, and increasingly self-obsessed, and such cancer affects every one of us in this community, whether we consider ourselves actively in pursuit of such glamour or not.
And I’m no exception. Swimming in this water, I constantly feel like questioning my choice, my competency and worth, because what I do moves me away from the usual means of external validation. But I know for a fact that society does not need any more suitors of external validation. And that’s why I feel the need to recapitulate — more to myself than to others — the importance of focusing on the unglamorous, simply because, well, other people don’t. A universal rule here is that if you find yourself stimulated by something (and the measure of success of that something is rather shallow), it is likely that a large number of people out there are also stimulated by it, and have rushed towards it. Recognizing that gives you an opportune moment to consider the opposite — what is something that’s as valuable but not properly rewarded by the current system? Maybe that’s something more worthy of investing in.
I happen to be doing some of those inglorious things, so I’ll be talking about them, but don’t let yourself be limited by my examples; the list of glamorous objects is a finite list, but a whole world of possibilities lies outside of the list.
First, the non-profit business just does not come with the usual qualities of shininess that Silicon Valley promotes. Because it is not directly associated with hardcore technological breakthroughs or straight-up dollar signs, the whole charity sector gives a soft, uncool, almost feminine air that’s alarmingly foreign to the superhero vibe that other “tech bros” emit. The daily work is indeed not as “cool” for the tech-centric definition of cool that’s strongly associated with masculinity. In any other environment, a profession centered around “helping people” might have a chance to gain respect, but in Silicon Valley, this line of work is not aggressive or cut-throat enough to permit any special charm.
Second, community organizing is rather low-tech (or non-tech) and chore-like. The daily work mostly amounts to answering requests from community members, directing them to places where they can find answers, as well as documenting and improving the process so that it is sustainable to both members and organizers. Every day feels like a thousand loose ends flying around in our face and our job is to tie enough knots in order to weave a legible pattern out of them. Without going deep into a cause-and-effect analysis, let me also point out that women and other minorities do disproportionately more administrative, coordination, and organizing work even when bearing the same titles as their peers.
Third, a lot of my current efforts fall in the category of “meta-science”: coordinating efforts and designing structures for the efficiency, longevity and accessibility of science. The only time this kind of work is highly regarded is in high- to executive-level management. When junior to mid-level researchers take on this job earlier than it’s normally prescribed, it often ends up being a thankless task.
Lastly, being a human connector. It is something I love doing, yet I hate how it’s never properly appreciated. It started from an earlier version of myself with lower self-esteem, thinking I was not someone worth knowing, but if I happened to have great friends, they certainly would benefit from knowing each other. By now I can keep doing this from a more empowered, confident place, and from that place I fully recognize its value and can let go of the vanity — I know I’m doing something highly valuable and I am ok with the credit never being attributed back to me, but still, the lack of recognition for these efforts can be disheartening.
“My protest” has two meanings: 1) I fully stand behind this action and will keep doing it, and 2) I am not satisfied with how this action is recognized and will be an advocate for it. Working on unglamorous, under-recognized, but effective and necessary work falls into this box nicely. It might not feed you a sense of accomplishment right away, but that’s exactly why it is the best — it teaches you to self-motivate and search for intrinsic validations. One day that will be your most valuable toolset.
We live in the information age, the digital technology era; as a consequence we probably have never been farther away from our human cores. How often do you think about the human side of things, when you see a development, a phenomenon, or a problem? If not worldly things, how about when seeing yourself in the mirror? Do you see the actual human that you are being, or the projects, emails, to-do lists, the KPIs and OKRs?
Another example has to do with deep learning (of course). For the past 15+ years, the revolutionary progress of deep learning is commonly attributed to three factors: data, algorithm and computational speedup. However, a key element has always been left unmentioned, which is that prompted by the initial success, a huge number of students, engineers and scientists poured their hearts into making it work better and better. And this number is still growing every year, which is the main if not only driving factor for continued breakthroughs.
Humans: the single most important but also most often neglected factor behind every single best achievement, as well as every worst mistake. (It is also behind every single best or worst personal experience we as humans could ever have. Think back to your best moments, your happiest self, and I’m sure there’s some human by your side. Recall instead your worst nightmare, your biggest heartbreak and fear, and there’s certainly some human you’d want to blame.)
But of course. Everything I talked about here is within the context of a humankind civilization, so humans are of course behind everything — why even bring it up?
Because at this particular moment in history, humanity is technology obsessed, so much so that we ponder problems and solutions often only in a technological framework. Moreover, I think a good number of us went into science and technology in the first place partially to avoid directly dealing with people — it is indeed a much messier subject to study or profess. However, the further you are in your self-actualization path, the more you’d realize that it’s all about humans after all.
You gain an enormous advantage if you start to look at common problems from the human perspective. That amounts to switching from “why does this nasty problem exist” to “why do humans let this problem exist” or “why aren’t humans motivated to do better.” Problems as big as wars between nations or as small (but critical) as the brokenness of the scientific review system can all be reduced to this simpler core: some humans involved are not properly empowered to do the right thing.
From that angle, solving any problem requires first understanding humans: what we want (to be seen, to be heard, to feel good and fulfilled), what our foibles are (hubris, jealousy, competition, greed). From there we can further understand when we act in good faith (when our needs are met or about to be met), and when we choose to act poorly (when our needs are not met, when operating under fear, insecurity, or detachment).
Seeing the human element prompts us to ask these questions when seeing how people’s actions differ in a given scenario:
- When you see a “bad actor” — ask not who they are or why they do this, but what happened to them, what the environment is doing to them.
- When you see a “good actor” — ask how they are inspired, empowered, what privileges and opportunities were given to them.
Just by starting to incorporate this angle into your daily problem solving routine, you will get much further ahead than most technology-centric people, and let me add that we have enough technologically savvy folks as world problem solvers, but we do not have nearly enough humanitarians.
When Jason and I started a science organization in 2020, that’s what I thought we’d be dealing with every day — science. But no. In the end, running an organization is a people business, and the human element has to be the central topic on our mind. The daily management of MLC, and more importantly how MLC is positioned to improve the world, are both entirely people-first: we want to lift up researchers by creating a better environment for them, and empowering them to do the right thing.
There’s a difference between project-centric institutions and human-centric ones. The former asks: “How can we make a team, and work in ways that maximize the outcome of the project?” The latter is all about “How can we ensure all individuals involved have the best experience possible, one that sets them up for future success and happiness?”
It doesn’t sound superior or cool, shifting from doing science to dealing with people, but it is again my protest that we need more uncool people-skilled thinkers and leaders.
If there are two words that summarize whom we build MLC for, it’s these two: talented, and underprivileged. (I sobbed realizing that these are the exact words I’d use to describe my dad.) Talent points to one’s innate ability, and privilege is often reflected by external circumstances. While the latter is relatively easy to understand (we can all tell whether a person is from a privileged background, whether they have a certain unearned advantage in society in comparison to others), talent is much harder to gauge. One’s self-perceived talent is usually off; we all start from a place where we either over- or under-estimate ourselves. Meanwhile, others’ perception of our talent, ironically, is usually distorted through their impression of our privilege. I spoke about this in my talk from two years ago, and I’ll reiterate: “talent” is given too much credit for a person’s success; a lot of what makes someone someone is access to opportunities, where privilege plays an outsized role.
Simply speaking, when you think you are evaluating a candidate’s talent, you are often distracted by proofs of privilege. Even the “most evolved or enlightened” of us suffer from this bias. I do not trust any human being to be bias-free, and even less to have full self-awareness. I may trust our ability to point out others’ biases, even when we tend to ignore our own. Therefore, the only way to mitigate group bias is a diversified committee where there’s hope that we call each other out. On the other hand, the worst thing is a committee composed of people who possess the exact same kind of bias and are always in alignment with each other.
The common wisdom in hiring is to find the currently most competent candidate. MLC is well positioned to protest this idea: we want instead to identify the most undervalued. We want to build a place for people with high potential but low opportunity access. This should be the role non-profits play, since we are not here for winning the capitalistic game, but for the betterment of people’s experiences. Personally, I also feel particularly drawn to this kind of people (likely because of my dad, who is a genius at heart but was deprived of education at an early age due to poverty), and I really want MLC, and my own life, to be at their service. There’s a special appeal they have — they are never pompous, loud, or assuming, and they are almost readily grateful for any act of kindness coming their way. As to their talent, it is almost a kind that’s unseen in traditional environments. The best word I could think of to describe it is “raw.” Sometimes they can be a bit of a rough diamond, and fascinatingly, rough in all different, almost unpredictable ways.
Exactly because they haven’t been “standardized” by the factory of common opportunities, many of their qualities might at first throw you off, only to then quickly shine through as endearing. For example, someone could be really good at coming up with new ideas but bad at reading existing algorithms; someone would be good at every step of conducting research but does not answer emails, or never shows up to a scheduled meeting. They are not “all-around” yet (otherwise they probably wouldn’t have come to us), but they do possess such a high potential that I could never look away once I lay my eyes on them.
On the surface, choosing between inclusive and exclusive seems like a no-brainer: the former is certainly more correct, more progressive, and hence widely advocated. But we would be lying if we didn’t admit that the latter seems more attractive and supplies the sense of specialty we all desperately want. We seem to live in a pretentious world where every public announcement showcases something inclusive, but every exciting invitation that lands in your mailbox hints that it’s exclusively just for you.
I don’t think either of them alone is the right answer, and am therefore protesting any narrative that highlights only one of them. And I happen to have a perfect answer: it is both, in an orderly process — you have to be inclusive from the start, so that you can arrive at exclusive in the end.
Why is the end goal of something always exclusive? Because a small, concentrated team with explicit commitment typically delivers the most satisfying work experience and outcome. But why does the starting point have to always be inclusive? Because I don’t think we should trust that any of us knows how to “select” the right members to include. Instead, open the door as wide as you can, and give everyone who wants a seat a chance to prove that they deserve one.
Inclusivity should be a requirement, a bar we all try to meet from the very start, whether it is about entrance to an event, an organization, a project or any other kind of team effort. Don’t worry about the chaos that ensues from opening a door too wide. Expect the chaos, embrace the messiness, and trust that it will eventually lead to something exclusively special. Because it is a foregone conclusion that energy will fade, and people will leave — and those that stay are the ones that you truly would’ve wanted to select in the first place. But you might have missed them, because we all unconsciously include our judgements and biases in an active selection process (like recruiting), and being inclusive is the best way to mitigate those biases.
The downside of this process is that it’s not particularly efficient. “Allowing time to work its wonder” means that you’d have to patiently wait, and observe. Such a mentality goes against the tide where the fast-paced, efficiency-first, competitive mindset is hailed all over the tech world, but that is exactly why I’d like to advocate for it.
The more we see recent machine learning wins coming from scaling (often with regard to the model size, data size, and compute budget, but I also want to mention the “team size,” as, again, we shouldn’t forget the human effort behind projects, and there’s definitely a visible scaling effect there), the more we worry about the exclusivity of resources, and the scarcity of opportunities at such a scale. Many seem to believe that anywhere that’s not a “top industrial lab” will soon have no place to compete in the race.
This might be true, if your sole purpose of research is to produce the “next shiniest thing” and to “win the race.” I do see this capitalistic side of research (especially for overhyped research areas), but I refuse to accept that’s all of it. If you see research as an endeavor to lift humans, small or medium scale research problems are undoubtedly the most fertile ground. There’s only one way to scale, but there are endless possibilities for other types of research.
I work almost exclusively on small-scale problems, not only due to a lack of opportune timing to join the handful of large-scale, high-potential projects, but also as a personal choice. Small scale is more fun: there are vastly more such problems, they are much easier to iterate on, and ironically, they are much more scalable, in the sense that you can easily work with a good number of collaborators and teams at the same time.
Most importantly, small scale research problems are the best ones to get people started: if you have never done ML research and are looking for a way in, pick a small idea and own it, from prototyping all the way to full experimentation. On large-scale projects each team member tends to be highly specialized, as that is the most efficient way to systemize a group effort. As a result one often learns a specific skill as opposed to seeing the whole picture. It’s the difference between joining a startup and a big corporation. In that world we recognize that there’s always a trade-off, and in no way is one absolutely better than the other. But in the machine learning scaling world we seem to overstate the benefits of one much more than the other.
I’ll even go out on a limb to say that, more often than not, with some kind of normalization on impact, small-scale projects demonstrate more of the authors’ talent, whereas in large-scale projects, at least some authors are there because of their privilege (not that they didn’t contribute or are lacking in talent; those are orthogonal factors). Think through the lens of what contributing to a project means to a person’s growth, in fundamental ways, and do not let sheer vanity or insecurity dictate what you work on. Then and only then will you learn to upweight small-scale projects and ideas.
Is there a middle ground that offers the best of both worlds? Yes. (And no, the answer is not mid-scale.) It is a large scale of small-scale problems. Think about this: there can only be a small scale of large-scale problems, but there is a large scale of small-scale problems! (Take a pause here to digest if needed.) BIG-bench, an openly collaborative project that’s been going on for two years, is a great example of this. It evaluates language models of various sizes and characteristics on crowd-sourced language tasks. Each task can be considered a small-scale problem, while the whole effort is undoubtedly large-scale. Read the newly released BIG-bench paper for findings. We need more of this kind of effort in science.
This post is taking longer to write than I thought (I started in mid-April), and I realized that more than the writing itself, it is the periodic surge of negative emotions (self-doubt, self-dismissal, pessimism) throughout the writing, which I had to deal with or escape from, that got in the way of progress. I wonder why this piece in particular (and last year’s piece, which I remember as also rather difficult to produce), among all my other forms of writing, evokes difficult emotions. I think the answer is that writing this article requires me to take on a role that I am not familiar with.
The role of a leader, preacher, or simply an uplifting speaker, which seems to be required for this kind of article, is new to me. I am so used to writing in a self-pitying voice (31,000 words and counting). Even though uplifting and downbeat are both valid parts of my character, in writing I’m more comfortable living inside one than the other. But for this exact reason, I want, and need, this kind of exercise: slightly uncomfortable and out-of-character, but aligned with how I ultimately want to be. More than just writing, I want a whole collection of “tiny difficult things” in both my personal and professional life. Because deep down I believe the right degree of discomfort helps us grow the most.
All the protests above are my tiny difficult things. None of them individually is in itself a grandiose action to overturn any major enemy, but each of them is in a small way trying to change the status quo. None of them, in action, incurs a huge resistance to overcome or consumes all my strength to carry out, but all of them are at least one or two standard deviations away from the task difficulty norm in life. Because they are tiny, if we win any of them we won’t be hailed as a hero of all time, but continuing to push for wins will generate a stream of satisfaction enough to keep us going. And I should say that it also permits a stream of chances to fail, but isn’t that a blessing too? The only reason I’m here, without exaggeration, is that I have failed, many times.
I am invested in this alternative way of living now; I’m in too deep to leave. If you also find yourself “too different” to fit into any mainstream lifestyle, you are welcome to join me, and try building your own set of tiny difficult things.
I don’t believe in one big act to save the world; I am willing to bet on small but palpable, noisy but purposeful, difficult but beneficial movements that will collectively guide us to a better place. (The same way we trust stochastic gradient descent.) And the best thing about those small acts is that we can take one on right here and now, without taking a big pledge. How about, for example, starting a conversation with that collaborator of yours who’s been a little difficult to work with, because you know that, deep down, you care about them and want them to succeed?
I’ll end with a relevant quote: “Everyone loves the big idea that will change the world. But what about the small idea that makes a difference?” (From Samuel Arbesman’s tweet.) And I will further add: “What about a bunch of small ideas that altogether, stochastically but determinedly, make a difference?”
Thanks to Andrey Kurenkov, Shreya Shankar, Joel Lehman, and Gautam Kamath for generous edits, feedback and discussions throughout the making of this post.
PS: Some content is covered in a recent The Gradient Podcast episode.